[ https://issues.apache.org/jira/browse/HIVE-24750?focusedWorklogId=558197&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-558197 ]

ASF GitHub Bot logged work on HIVE-24750:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 21:18
            Start Date: 25/Feb/21 21:18
    Worklog Time Spent: 10m 
      Work Description: pkumarsinha commented on a change in pull request #1954:
URL: https://github.com/apache/hive/pull/1954#discussion_r583203211



##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
##########
@@ -1375,6 +1376,245 @@ public void differentCatalogIncrementalReplication() throws Throwable {
     primary.run("drop database if exists " + sparkDbName + " cascade");
   }
 
+  @Test
+  public void testDatabaseLevelCopyLazy() throws Throwable {
+    testDatabaseLevelCopy(true);
+  }
+
+  @Test
+  public void testDatabaseLevelCopyAtSource() throws Throwable {
+    testDatabaseLevelCopy(false);
+  }
+
+  public void testDatabaseLevelCopy(boolean runCopyTasksOnTarget)
+      throws Throwable {
+    Path externalTableLocation =
+        new Path("/" + testName.getMethodName() + "/" + primaryDbName + "/" + "a/");
+    DistributedFileSystem fs = primary.miniDFSCluster.getFileSystem();
+    fs.mkdirs(externalTableLocation, new FsPermission("777"));
+
+    Path externalTablePartitionLocation =
+        new Path("/" + testName.getMethodName() + "/" + primaryDbName + "/" + "part/");
+    fs.mkdirs(externalTableLocation, new FsPermission("777"));
+
+    List<String> withClause = Arrays.asList(
+        "'distcp.options.update'='','hive.repl.external.warehouse.single.copy.task'='true'",
+        "'" + HiveConf.ConfVars.REPL_RUN_DATA_COPY_TASKS_ON_TARGET.varname
+            + "'='" + runCopyTasksOnTarget + "'");
+
+    // Create a table within the warehouse location, one outside and one with
+    // a partition outside the default location.
+    WarehouseInstance.Tuple tuple =
+        primary.run("use " + primaryDbName)
+            .run("create external table a (i int, j int) "
+                + "row format delimited fields terminated by ',' "
+                + "location '" + externalTableLocation.toUri() + "'")
+            .run("insert into a values(1,2)")
+            .run("create external table b (id int)")
+            .run("insert into b values(5)")
+            .run("create external table c (place string) partitioned by (country "
+                + "string)")
+            .run("insert into table c partition(country='india') values "
+                + "('bangalore')")
+            .run("ALTER TABLE c ADD PARTITION (country='france') LOCATION '"
+                + externalTablePartitionLocation.toString() + "'")
+            .run("insert into c partition(country='france') values('paris')")
+            .dump(primaryDbName, withClause);
+
+    Database primaryDb = primary.getDatabase(primaryDbName);
+
+    // Confirm the a table is outside the db location.
+    Table aTable = primary.getTable(primaryDbName, "a");
+    new Path(aTable.getSd().getLocation());

Review comment:
       Is this line added by mistake? You can remove this as it doesn't seem to 
be doing anything.

##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
##########
@@ -1375,6 +1376,245 @@ public void differentCatalogIncrementalReplication() throws Throwable {
     primary.run("drop database if exists " + sparkDbName + " cascade");
   }
 
+  @Test
+  public void testDatabaseLevelCopyLazy() throws Throwable {
+    testDatabaseLevelCopy(true);
+  }
+
+  @Test
+  public void testDatabaseLevelCopyAtSource() throws Throwable {
+    testDatabaseLevelCopy(false);
+  }
+
+  public void testDatabaseLevelCopy(boolean runCopyTasksOnTarget)
+      throws Throwable {
+    Path externalTableLocation =
+        new Path("/" + testName.getMethodName() + "/" + primaryDbName + "/" + "a/");
+    DistributedFileSystem fs = primary.miniDFSCluster.getFileSystem();
+    fs.mkdirs(externalTableLocation, new FsPermission("777"));
+
+    Path externalTablePartitionLocation =
+        new Path("/" + testName.getMethodName() + "/" + primaryDbName + "/" + "part/");
+    fs.mkdirs(externalTableLocation, new FsPermission("777"));
+
+    List<String> withClause = Arrays.asList(
+        "'distcp.options.update'='','hive.repl.external.warehouse.single.copy.task'='true'",

Review comment:
       Sorry, I missed this last time. Please use a constant for the config 
hive.repl.external.warehouse.single.copy.task instead of the string literal.
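
A rough sketch of what the suggested change could look like. The nested ConfVars enum below is a hypothetical stand-in so the snippet compiles on its own; in the test it would presumably be HiveConf.ConfVars. The constant name REPL_EXTERNAL_WAREHOUSE_SINGLE_COPY_TASK and the key string given for REPL_RUN_DATA_COPY_TASKS_ON_TARGET are assumptions and should be checked against HiveConf before use. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;

public class WithClauseSketch {

    // Hypothetical stand-in for HiveConf.ConfVars (assumption: the real enum
    // exposes constants with these names and a public 'varname' field).
    enum ConfVars {
        REPL_EXTERNAL_WAREHOUSE_SINGLE_COPY_TASK("hive.repl.external.warehouse.single.copy.task"),
        // Assumed key string; only the constant name appears in the diff.
        REPL_RUN_DATA_COPY_TASKS_ON_TARGET("hive.repl.run.data.copy.tasks.on.target");

        final String varname;

        ConfVars(String varname) {
            this.varname = varname;
        }
    }

    // Builds the same two WITH-clause entries as the test, but takes the
    // config key from the constant instead of a hard-coded string literal.
    static List<String> buildWithClause(boolean runCopyTasksOnTarget) {
        return Arrays.asList(
            "'distcp.options.update'='','"
                + ConfVars.REPL_EXTERNAL_WAREHOUSE_SINGLE_COPY_TASK.varname + "'='true'",
            "'" + ConfVars.REPL_RUN_DATA_COPY_TASKS_ON_TARGET.varname
                + "'='" + runCopyTasksOnTarget + "'");
    }

    public static void main(String[] args) {
        buildWithClause(true).forEach(System.out::println);
    }
}
```

Referencing the constant keeps the test in sync if the key is ever renamed, which a string literal would not.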






Issue Time Tracking
-------------------

    Worklog Id:     (was: 558197)
    Time Spent: 50m  (was: 40m)

> Create a single copy task for external tables within default DB location
> ------------------------------------------------------------------------
>
>                 Key: HIVE-24750
>                 URL: https://issues.apache.org/jira/browse/HIVE-24750
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Presently we create a single copy task for each table, but for the tables 
> within the default DB location, we can copy the DB location in one task.
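
The idea in the description can be illustrated with a small, self-contained sketch. This is not the Hive implementation: the class CopyTaskPlanner, the method planCopySources, and the example paths are all hypothetical, and locations are compared as plain strings rather than Hadoop Path objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CopyTaskPlanner {

    // Returns the list of source paths to copy: the DB location once if at
    // least one table lives under it, plus every table location outside it.
    // Simplification: a table located exactly at dbLocation is treated as outside.
    static List<String> planCopySources(String dbLocation, List<String> tableLocations) {
        String dbPrefix = dbLocation.endsWith("/") ? dbLocation : dbLocation + "/";
        List<String> sources = new ArrayList<>();
        boolean dbLocationAdded = false;
        for (String location : tableLocations) {
            if (location.startsWith(dbPrefix)) {
                // Covered by the single DB-level copy; add that source only once.
                if (!dbLocationAdded) {
                    sources.add(dbLocation);
                    dbLocationAdded = true;
                }
            } else {
                // Outside the DB location: still needs its own copy task.
                sources.add(location);
            }
        }
        return sources;
    }

    public static void main(String[] args) {
        List<String> sources = planCopySources(
            "/warehouse/db",
            Arrays.asList("/warehouse/db/b", "/warehouse/db/c", "/external/a"));
        System.out.println(sources);
    }
}
```

With the example input, two tables fall under the DB location and collapse into one source, while the table outside keeps its own entry.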



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
