[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-06-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=453168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-453168
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 01/Jul/20 00:31
Start Date: 01/Jul/20 00:31
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #977:
URL: https://github.com/apache/hive/pull/977


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 453168)
Time Spent: 2h 20m  (was: 2h 10m)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch, 
> HIVE-23040.03.patch, HIVE-23040.04.patch, HIVE-23040.05.patch, 
> HIVE-23040.06.patch, HIVE-23040.06.patch, HIVE-23040.07.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=450095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450095
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:24
Start Date: 24/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #977:
URL: https://github.com/apache/hive/pull/977#issuecomment-648503943


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 450095)
Time Spent: 2h 10m  (was: 2h)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch, 
> HIVE-23040.03.patch, HIVE-23040.04.patch, HIVE-23040.05.patch, 
> HIVE-23040.06.patch, HIVE-23040.06.patch, HIVE-23040.07.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=426164&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426164
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 22/Apr/20 13:59
Start Date: 22/Apr/20 13:59
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r413007694



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -438,16 +450,24 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive
     String dbName = (null != work.dbNameOrPattern && !work.dbNameOrPattern.isEmpty())
         ? work.dbNameOrPattern
         : "?";
-    int maxEventLimit = work.maxEventLimit();
     replLogger = new IncrementalDumpLogger(dbName, dumpRoot.toString(),
         evFetcher.getDbNotificationEventsCount(work.eventFrom, dbName, work.eventTo, maxEventLimit),
         work.eventFrom, work.eventTo, maxEventLimit);
     replLogger.startLog();
+    long dumpedCount = resumeFrom - work.eventFrom;
+    if (dumpedCount > 0) {
+      LOG.info("Event id {} to {} are already dumped, skipping {} events", work.eventFrom, resumeFrom, dumpedCount);
+    }
+    cleanFailedEventDirIfExists(dumpRoot, resumeFrom);
     while (evIter.hasNext()) {
       NotificationEvent ev = evIter.next();
       lastReplId = ev.getEventId();
       Path evRoot = new Path(dumpRoot, String.valueOf(lastReplId));

Review comment:
   Path creation is not needed for the skipped events; it can be done only for the new events.
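
   A minimal sketch of what this suggestion could look like inside the event loop, assuming the loop already knows resumeFrom; the per-event dump logic is elided and the shape is illustrative only, not the actual patch:

       while (evIter.hasNext()) {
         NotificationEvent ev = evIter.next();
         lastReplId = ev.getEventId();
         if (lastReplId <= resumeFrom) {
           // Already dumped by the previous attempt: skip without creating an event directory.
           continue;
         }
         Path evRoot = new Path(dumpRoot, String.valueOf(lastReplId));
         // ... dump the event under evRoot as before ...
       }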







Issue Time Tracking
---

Worklog Id: (was: 426164)
Time Spent: 2h  (was: 1h 50m)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch, 
> HIVE-23040.03.patch, HIVE-23040.04.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=426163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426163
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 22/Apr/20 13:57
Start Date: 22/Apr/20 13:57
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r413006043



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -425,11 +432,16 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive
     EventUtils.MSClientNotificationFetcher evFetcher
         = new EventUtils.MSClientNotificationFetcher(hiveDb);

+
+    int maxEventLimit = getMaxEventAllowed(work.maxEventLimit());
     EventUtils.NotificationEventIterator evIter = new EventUtils.NotificationEventIterator(
-        evFetcher, work.eventFrom, work.maxEventLimit(), evFilter);
+        evFetcher, work.eventFrom, maxEventLimit, evFilter);

Review comment:
   If maxEventLimit is no longer used from ReplDumpWork, it can be removed from there.







Issue Time Tracking
---

Worklog Id: (was: 426163)
Time Spent: 1h 50m  (was: 1h 40m)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch, 
> HIVE-23040.03.patch, HIVE-23040.04.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=426162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426162
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 22/Apr/20 13:56
Start Date: 22/Apr/20 13:56
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r413005203



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -511,6 +539,71 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive
     return lastReplId;
   }

+  private int getMaxEventAllowed(int currentEventMaxLimit) {
+    int maxDirItems = Integer.parseInt(conf.get(ReplUtils.DFS_MAX_DIR_ITEMS_CONFIG, "0"));
+    if (maxDirItems > 0) {
+      maxDirItems = maxDirItems - ReplUtils.RESERVED_DIR_ITEMS_COUNT;
+      if (maxDirItems < currentEventMaxLimit) {
+        LOG.warn("Changing the maxEventLimit from {} to {} as the '" + ReplUtils.DFS_MAX_DIR_ITEMS_CONFIG
+            + "' limit encountered. Set this config appropriately to increase the maxEventLimit",
+            currentEventMaxLimit, maxDirItems);
+        currentEventMaxLimit = maxDirItems;
+      }
+    }
+    return currentEventMaxLimit;
+  }
+
+  private void cleanFailedEventDirIfExists(Path dumpDir, long resumeFrom) throws IOException {
+    Path nextEventRoot = new Path(dumpDir, String.valueOf(resumeFrom + 1));
+    Retry<Void> retriable = new Retry<Void>(IOException.class) {
+      @Override
+      public Void execute() throws IOException {
+        FileSystem fs = FileSystem.get(nextEventRoot.toUri(), conf);
+        if (fs.exists(nextEventRoot)) {
+          fs.delete(nextEventRoot, true);
+        }
+        return null;
+      }
+    };
+    try {
+      retriable.run();
+    } catch (Exception e) {
+      throw new IOException(e);
+    }
+  }
+
+  public void updateLastEventDumped(Path ackFile, long lastReplId) throws SemanticException {
+    List<List<String>> listValues = new ArrayList<>();

Review comment:
   why is this a list?
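
   If the checkpoint really only needs the last dumped event id, a hypothetical simplification (not the actual patch) could write a single value instead of a list; this sketch assumes plain Hadoop FileSystem APIs and an illustrative method shape:

       private void updateLastEventDumped(FileSystem fs, Path ackFile, long lastReplId) throws IOException {
         // Overwrite the checkpoint file with just the latest dumped event id.
         try (FSDataOutputStream out = fs.create(ackFile, true)) {
           out.writeBytes(String.valueOf(lastReplId));
         }
       }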







Issue Time Tracking
---

Worklog Id: (was: 426162)
Time Spent: 1h 40m  (was: 1.5h)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch, 
> HIVE-23040.03.patch, HIVE-23040.04.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=426159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426159
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 22/Apr/20 13:36
Start Date: 22/Apr/20 13:36
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r412988341



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +670,541 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 

Review comment:
   Add a test for external table incremental checkpointing
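
   A rough skeleton of such a test, reusing the WarehouseInstance helpers visible elsewhere in this file; the with-clause option and the dump/load overloads that accept it are assumptions here, not part of the patch:

       @Test
       public void testExternalTableIncrementalDumpCheckpointing() throws Throwable {
         List<String> withClause = Collections.singletonList("'hive.repl.include.external.tables'='true'");
         primary.run("use " + primaryDbName)
             .run("CREATE EXTERNAL TABLE ext1(a string) STORED AS TEXTFILE")
             .dump(primaryDbName, withClause);
         replica.load(replicatedDbName, primaryDbName, withClause);

         ReplDumpWork.testDeletePreviousDumpMetaPath(true);
         WarehouseInstance.Tuple incDump = primary.run("use " + primaryDbName)
             .run("insert into ext1 values (1)")
             .dump(primaryDbName, withClause);

         // Simulate a failure right before the final ack was written.
         Path hiveDumpDir = new Path(incDump.dumpLocation, ReplUtils.REPL_HIVE_BASE_DIR);
         FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
         fs.delete(new Path(hiveDumpDir, ReplAck.DUMP_ACKNOWLEDGEMENT.toString()), false);

         ReplDumpWork.testDeletePreviousDumpMetaPath(false);
         WarehouseInstance.Tuple retryDump = primary.run("use " + primaryDbName).dump(primaryDbName, withClause);
         assertEquals(incDump.dumpLocation, retryDump.dumpLocation);

         replica.load(replicatedDbName, primaryDbName, withClause)
             .run("select * from " + replicatedDbName + ".ext1")
             .verifyResults(new String[] {"1"});
       }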







Issue Time Tracking
---

Worklog Id: (was: 426159)
Time Spent: 1.5h  (was: 1h 20m)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch, 
> HIVE-23040.03.patch, HIVE-23040.04.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=426093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426093
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 22/Apr/20 09:46
Start Date: 22/Apr/20 09:46
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r412835543



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,454 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump1 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (1)")
+.run("insert into t2 values (2)")
+.dump(primaryDbName);
+
+Path hiveDumpDir = new Path(incrementalDump1.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+Path ackFile = new Path(hiveDumpDir, 
ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+Path ackLastEventID = new Path(hiveDumpDir, 
ReplAck.EVENTS_DUMP.toString());
+FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+
+long firstIncEventID = Long.parseLong(bootstrapDump.lastReplicationId) + 1;
+long lastIncEventID = Long.parseLong(incrementalDump1.lastReplicationId);
+assertTrue(lastIncEventID > (firstIncEventID + 1));
+Map<Path, Long> pathModTimeMap = new HashMap<>();
+for (long eventId=firstIncEventID; eventId<=lastIncEventID; eventId++) {
+  Path eventRoot = new Path(hiveDumpDir, String.valueOf(eventId));
+  if (fs.exists(eventRoot)) {
+for (FileStatus fileStatus: fs.listStatus(eventRoot)) {
+  pathModTimeMap.put(fileStatus.getPath(), 
fileStatus.getModificationTime());
+}
+  }
+}
+
+ReplDumpWork.testDeletePreviousDumpMetaPath(false);
+WarehouseInstance.Tuple incrementalDump2 = primary.run("use " + 
primaryDbName)
+.dump(primaryDbName);
+assertEquals(incrementalDump1.dumpLocation, incrementalDump2.dumpLocation);
+assertTrue(fs.exists(ackFile));
+//check events were not rewritten.
+for (Map.Entry<Path, Long> entry : pathModTimeMap.entrySet()) {
+  assertEquals((long)entry.getValue(), fs.getFileStatus(new 
Path(hiveDumpDir, entry.getKey())).getModificationTime());
+}
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {"1"})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {"2"});
+
+
+//Case 2: When the last dump was half way through
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump3 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (3)")
+.run("insert into t2 values (4)")
+.dump(primaryDbName);
+
+hiveDumpDir = new Path(incrementalDump3.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+ackFile = new Path(hiveDumpDir, ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+ackLastEventID = new Path(hiveDumpDir, ReplAck.EVENTS_DUMP.toString());
+fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+//delete last three events and test if it recovers.
+long lastEventID = Long.parseLong(incrementalDump3.lastReplicationId);
+Path lastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID));
+Path secondLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 1));
+Path thirdLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 2));
+assertTrue(fs.exists(lastEvtRoot));
+assertTrue(fs.exists(secondLastEvtRoot));
+assertTrue(fs.exists(

[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=426072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426072
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 22/Apr/20 08:57
Start Date: 22/Apr/20 08:57
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r412801244



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,454 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump1 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (1)")
+.run("insert into t2 values (2)")
+.dump(primaryDbName);
+
+Path hiveDumpDir = new Path(incrementalDump1.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+Path ackFile = new Path(hiveDumpDir, 
ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+Path ackLastEventID = new Path(hiveDumpDir, 
ReplAck.EVENTS_DUMP.toString());
+FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+
+long firstIncEventID = Long.parseLong(bootstrapDump.lastReplicationId) + 1;
+long lastIncEventID = Long.parseLong(incrementalDump1.lastReplicationId);
+assertTrue(lastIncEventID > (firstIncEventID + 1));
+Map<Path, Long> pathModTimeMap = new HashMap<>();
+for (long eventId=firstIncEventID; eventId<=lastIncEventID; eventId++) {
+  Path eventRoot = new Path(hiveDumpDir, String.valueOf(eventId));
+  if (fs.exists(eventRoot)) {
+for (FileStatus fileStatus: fs.listStatus(eventRoot)) {
+  pathModTimeMap.put(fileStatus.getPath(), 
fileStatus.getModificationTime());
+}
+  }
+}
+
+ReplDumpWork.testDeletePreviousDumpMetaPath(false);
+WarehouseInstance.Tuple incrementalDump2 = primary.run("use " + 
primaryDbName)
+.dump(primaryDbName);
+assertEquals(incrementalDump1.dumpLocation, incrementalDump2.dumpLocation);
+assertTrue(fs.exists(ackFile));
+//check events were not rewritten.
+for (Map.Entry<Path, Long> entry : pathModTimeMap.entrySet()) {
+  assertEquals((long)entry.getValue(), fs.getFileStatus(new 
Path(hiveDumpDir, entry.getKey())).getModificationTime());
+}
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {"1"})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {"2"});
+
+
+//Case 2: When the last dump was half way through
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump3 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (3)")
+.run("insert into t2 values (4)")
+.dump(primaryDbName);
+
+hiveDumpDir = new Path(incrementalDump3.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+ackFile = new Path(hiveDumpDir, ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+ackLastEventID = new Path(hiveDumpDir, ReplAck.EVENTS_DUMP.toString());
+fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+//delete last three events and test if it recovers.
+long lastEventID = Long.parseLong(incrementalDump3.lastReplicationId);
+Path lastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID));
+Path secondLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 1));
+Path thirdLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 2));
+assertTrue(fs.exists(lastEvtRoot));
+assertTrue(fs.exists(secondLastEvtRoot));
+assertTrue(fs.exists(thirdL

[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=425396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425396
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 20/Apr/20 17:40
Start Date: 20/Apr/20 17:40
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r411567286



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump1 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (1)")
+.run("insert into t2 values (2)")
+.dump(primaryDbName);
+
+Path hiveDumpDir = new Path(incrementalDump1.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+Path ackFile = new Path(hiveDumpDir, 
ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+Path ackLastEventID = new Path(hiveDumpDir, 
ReplAck.EVENTS_DUMP.toString());
+FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+
+Map<String, Long> eventModTimeMap = new HashMap<>();
+long firstIncEventID = Long.parseLong(bootstrapDump.lastReplicationId) + 1;
+long lastIncEventID = Long.parseLong(incrementalDump1.lastReplicationId);
+assertTrue(lastIncEventID > (firstIncEventID + 1));
+
+for (long eventId=firstIncEventID; eventId<=lastIncEventID; eventId++) {
+  Path eventRoot = new Path(hiveDumpDir, String.valueOf(eventId));
+  if (fs.exists(eventRoot)) {
+eventModTimeMap.put(String.valueOf(eventId), 
fs.getFileStatus(eventRoot).getModificationTime());

Review comment:
   Please cross-check this. Also add a test where, for instance, 5 events are dumped: delete 2 events and keep 3, and verify that only the last 2 events are rewritten.
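
   A sketch of the check being asked for, following the same FileSystem/mod-time pattern already used in this test; the event-id arithmetic and the handling of the _events_dump checkpoint file are illustrative assumptions:

       // Capture modification times of the 5 dumped event directories.
       long lastEventID = Long.parseLong(incrementalDump.lastReplicationId);
       Map<String, Long> modTimeBefore = new HashMap<>();
       for (long id = lastEventID - 4; id <= lastEventID; id++) {
         modTimeBefore.put(String.valueOf(id),
             fs.getFileStatus(new Path(hiveDumpDir, String.valueOf(id))).getModificationTime());
       }

       // Simulate a failure after the third event: drop the last 2 event dirs and the final ack
       // (the _events_dump checkpoint would also need to be rewound, as the later tests do).
       fs.delete(new Path(hiveDumpDir, String.valueOf(lastEventID)), true);
       fs.delete(new Path(hiveDumpDir, String.valueOf(lastEventID - 1)), true);
       fs.delete(new Path(hiveDumpDir, ReplAck.DUMP_ACKNOWLEDGEMENT.toString()), false);

       // Re-dump and verify the 3 surviving events kept their old modification times,
       // i.e. only the 2 deleted events were written again.
       primary.run("use " + primaryDbName).dump(primaryDbName);
       for (long id = lastEventID - 4; id <= lastEventID - 2; id++) {
         assertEquals((long) modTimeBefore.get(String.valueOf(id)),
             fs.getFileStatus(new Path(hiveDumpDir, String.valueOf(id))).getModificationTime());
       }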







Issue Time Tracking
---

Worklog Id: (was: 425396)
Time Spent: 1h  (was: 50m)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=425346&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425346
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 20/Apr/20 14:54
Start Date: 20/Apr/20 14:54
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r411444984



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump1 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (1)")
+.run("insert into t2 values (2)")
+.dump(primaryDbName);
+
+Path hiveDumpDir = new Path(incrementalDump1.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+Path ackFile = new Path(hiveDumpDir, 
ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+Path ackLastEventID = new Path(hiveDumpDir, 
ReplAck.EVENTS_DUMP.toString());
+FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+
+Map<String, Long> eventModTimeMap = new HashMap<>();
+long firstIncEventID = Long.parseLong(bootstrapDump.lastReplicationId) + 1;
+long lastIncEventID = Long.parseLong(incrementalDump1.lastReplicationId);
+assertTrue(lastIncEventID > (firstIncEventID + 1));
+
+for (long eventId=firstIncEventID; eventId<=lastIncEventID; eventId++) {
+  Path eventRoot = new Path(hiveDumpDir, String.valueOf(eventId));
+  if (fs.exists(eventRoot)) {
+eventModTimeMap.put(String.valueOf(eventId), 
fs.getFileStatus(eventRoot).getModificationTime());
+  }
+}
+
+ReplDumpWork.testDeletePreviousDumpMetaPath(false);
+WarehouseInstance.Tuple incrementalDump2 = primary.run("use " + 
primaryDbName)
+.dump(primaryDbName);
+assertEquals(incrementalDump1.dumpLocation, incrementalDump2.dumpLocation);
+assertTrue(fs.exists(ackFile));
+//check events were not rewritten.
+for (Map.Entry<String, Long> entry : eventModTimeMap.entrySet()) {
+  long oldModTime = entry.getValue();
+  long newModTime = fs.getFileStatus(new Path(hiveDumpDir, 
entry.getKey())).getModificationTime();
+  assertEquals(oldModTime, newModTime);
+}
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {"1"})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {"2"});
+
+
+//Case 2: When the last dump was half way through
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump3 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (3)")
+.run("insert into t2 values (4)")
+.dump(primaryDbName);
+
+hiveDumpDir = new Path(incrementalDump3.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+ackFile = new Path(hiveDumpDir, ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+ackLastEventID = new Path(hiveDumpDir, ReplAck.EVENTS_DUMP.toString());
+fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+//delete last three events and test if it recovers.
+long lastEventID = Long.parseLong(incrementalDump3.lastReplicationId);
+Path lastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID));
+Path secondLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 1));
+Path thirdLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 2));
+assertTrue(fs.exists(lastEvtRoot));
+assertTrue(fs.exists(secondLastEvtRoot));
+assert

[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=425344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425344
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 20/Apr/20 14:45
Start Date: 20/Apr/20 14:45
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r411249656



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -33,6 +33,8 @@
 import org.apache.hadoop.hive.ql.DriverFactory;

Review comment:
   Cannot add the regular copy one until HIVE-23235 is fixed.

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -33,6 +33,8 @@
 import org.apache.hadoop.hive.ql.DriverFactory;

Review comment:
   testIncrementalDumpCheckpointing() checks for partial failure cases.

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);

Review comment:
   This prevents the older valid dump directory from getting deleted, to mimic the case when the current dump fails in between.
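
   A hypothetical illustration of how such a test-only switch can work (this is not the actual ReplDumpWork code, just the idea the flag name and this explanation suggest):

       // Test hook: when set, the previous dump directory is left in place so the next
       // dump finds a partially written dump to resume from.
       private static boolean testDeletePreviousDumpMetaPath = false;

       public static void testDeletePreviousDumpMetaPath(boolean skipDeletion) {
         testDeletePreviousDumpMetaPath = skipDeletion;
       }

       private void deletePreviousDumpMeta(FileSystem fs, Path previousDumpPath) throws IOException {
         if (testDeletePreviousDumpMetaPath) {
           return; // keep the old dump dir around, mimicking a dump that failed mid-way
         }
         fs.delete(previousDumpPath, true);
       }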

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump1 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (1)")
+.run("insert into t2 values (2)")
+.dump(primaryDbName);
+
+Path hiveDumpDir = new Path(incrementalDump1.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+Path ackFile = new Path(hiveDumpDir, 
ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+Path ackLastEventID = new Path(hiveDumpDir, 
ReplAck.EVENTS_DUMP.toString());
+FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+
+Map<String, Long> eventModTimeMap = new HashMap<>();
+long firstIncEventID = Long.parseLong(bootstrapDump.lastReplicationId) + 1;
+long lastIncEventID = Long.parseLong(incrementalDump1.lastReplicationId);
+assertTrue(lastIncEventID > (firstIncEventID + 1));
+
+for (long eventId=firstIncEventID; eventId<=lastIncEventID; eventId++) {
+  Path eventRoot = new Path(hiveDumpDir, String.valueOf(eventId));
+  if (fs.exists(eventRoot)) {
+eventModTimeMap.put(String.valueOf(eventId), 
fs.getFileStatus(eventRoot).getModificationTime());
+  }
+}
+
+ReplDumpWork.testDeletePreviousDumpMetaPath(false);
+WarehouseInstance.Tuple incrementalDump2 = primary.run("use " + 
primaryDbName)
+.dump(primaryDbName);
+assertEquals(incrementalDump1.dumpLocation, incrementalDump2.dumpLocation);
+assertTrue(fs.exists(ackFile));
+//check events were not rewritten.
+for (Map.Entry<String, Long> entry : eventModTimeMap.entrySet()) {
+  

[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=425300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425300
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 20/Apr/20 12:26
Start Date: 20/Apr/20 12:26
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r411335268



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump1 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (1)")
+.run("insert into t2 values (2)")
+.dump(primaryDbName);
+
+Path hiveDumpDir = new Path(incrementalDump1.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+Path ackFile = new Path(hiveDumpDir, 
ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+Path ackLastEventID = new Path(hiveDumpDir, 
ReplAck.EVENTS_DUMP.toString());
+FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+
+Map<String, Long> eventModTimeMap = new HashMap<>();
+long firstIncEventID = Long.parseLong(bootstrapDump.lastReplicationId) + 1;
+long lastIncEventID = Long.parseLong(incrementalDump1.lastReplicationId);
+assertTrue(lastIncEventID > (firstIncEventID + 1));
+
+for (long eventId=firstIncEventID; eventId<=lastIncEventID; eventId++) {
+  Path eventRoot = new Path(hiveDumpDir, String.valueOf(eventId));
+  if (fs.exists(eventRoot)) {
+eventModTimeMap.put(String.valueOf(eventId), 
fs.getFileStatus(eventRoot).getModificationTime());
+  }
+}
+
+ReplDumpWork.testDeletePreviousDumpMetaPath(false);
+WarehouseInstance.Tuple incrementalDump2 = primary.run("use " + 
primaryDbName)
+.dump(primaryDbName);
+assertEquals(incrementalDump1.dumpLocation, incrementalDump2.dumpLocation);
+assertTrue(fs.exists(ackFile));
+//check events were not rewritten.
+for (Map.Entry<String, Long> entry : eventModTimeMap.entrySet()) {
+  long oldModTime = entry.getValue();
+  long newModTime = fs.getFileStatus(new Path(hiveDumpDir, 
entry.getKey())).getModificationTime();
+  assertEquals(oldModTime, newModTime);
+}
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {"1"})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {"2"});
+
+
+//Case 2: When the last dump was half way through
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump3 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (3)")
+.run("insert into t2 values (4)")
+.dump(primaryDbName);
+
+hiveDumpDir = new Path(incrementalDump3.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+ackFile = new Path(hiveDumpDir, ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+ackLastEventID = new Path(hiveDumpDir, ReplAck.EVENTS_DUMP.toString());
+fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+//delete last three events and test if it recovers.
+long lastEventID = Long.parseLong(incrementalDump3.lastReplicationId);
+Path lastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID));
+Path secondLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 1));
+Path thirdLastEvtRoot = new Path(hiveDumpDir + File.separator + 
String.valueOf(lastEventID - 2));
+assertTrue(fs.exists(lastEvtRoot));
+assertTrue(fs.exists(secondLastEvtRoot));
+assertTrue(f

[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=425246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425246
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 20/Apr/20 08:51
Start Date: 20/Apr/20 08:51
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #977:
URL: https://github.com/apache/hive/pull/977#discussion_r411194373



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);

Review comment:
   Why is this needed?

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it failed.
+ReplDumpWork.testDeletePreviousDumpMetaPath(true);
+
+WarehouseInstance.Tuple incrementalDump1 = primary.run("use " + 
primaryDbName)
+.run("insert into t1 values (1)")
+.run("insert into t2 values (2)")
+.dump(primaryDbName);
+
+Path hiveDumpDir = new Path(incrementalDump1.dumpLocation, 
ReplUtils.REPL_HIVE_BASE_DIR);
+Path ackFile = new Path(hiveDumpDir, 
ReplAck.DUMP_ACKNOWLEDGEMENT.toString());
+Path ackLastEventID = new Path(hiveDumpDir, 
ReplAck.EVENTS_DUMP.toString());
+FileSystem fs = FileSystem.get(hiveDumpDir.toUri(), primary.hiveConf);
+assertTrue(fs.exists(ackFile));
+assertTrue(fs.exists(ackLastEventID));
+
+fs.delete(ackFile, false);
+
+Map<String, Long> eventModTimeMap = new HashMap<>();
+long firstIncEventID = Long.parseLong(bootstrapDump.lastReplicationId) + 1;
+long lastIncEventID = Long.parseLong(incrementalDump1.lastReplicationId);
+assertTrue(lastIncEventID > (firstIncEventID + 1));
+
+for (long eventId=firstIncEventID; eventId<=lastIncEventID; eventId++) {
+  Path eventRoot = new Path(hiveDumpDir, String.valueOf(eventId));
+  if (fs.exists(eventRoot)) {
+eventModTimeMap.put(String.valueOf(eventId), 
fs.getFileStatus(eventRoot).getModificationTime());

Review comment:
   Should check the individual files: the root-level directory timestamp may not change even if its contents change.
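
   A small sketch of the per-file check, assuming the FileSystem handle and event directory from the surrounding test:

       // Record the modification time of every file under the event directory,
       // not just the directory itself.
       Map<Path, Long> fileModTimes = new HashMap<>();
       for (FileStatus status : fs.listStatus(eventRoot)) {
         fileModTimes.put(status.getPath(), status.getModificationTime());
       }
       // After the re-dump, compare each file's modification time against fileModTimes.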

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java
##
@@ -663,6 +669,322 @@ public void testMultiDBTxn() throws Throwable {
 }
   }
 
+  @Test
+  public void testIncrementalDumpCheckpointing() throws Throwable {
+WarehouseInstance.Tuple bootstrapDump = primary.run("use " + primaryDbName)
+.run("CREATE TABLE t1(a string) STORED AS TEXTFILE")
+.run("CREATE TABLE t2(a string) STORED AS TEXTFILE")
+.dump(primaryDbName);
+
+replica.load(replicatedDbName, primaryDbName)
+.run("select * from " + replicatedDbName + ".t1")
+.verifyResults(new String[] {})
+.run("select * from " + replicatedDbName + ".t2")
+.verifyResults(new String[] {});
+
+
+//Case 1: When the last dump finished all the events and
+//only  _finished_dump file at the hiveDumpRoot was about to be written 
when it faile

[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-04-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=420288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420288
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 10/Apr/20 16:02
Start Date: 10/Apr/20 16:02
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #977: HIVE-23040 
: Checkpointing for repl dump incremental phase
URL: https://github.com/apache/hive/pull/977
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 420288)
Remaining Estimate: 0h
Time Spent: 10m

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040-WIP.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



