[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=608310=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608310
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 06:55
Start Date: 08/Jun/21 06:55
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #2311:
URL: https://github.com/apache/hive/pull/2311#issuecomment-856505462


   Committed to master.
   
   Thanks for the patch, @hmangla98 !!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 608310)
Time Spent: 5h  (was: 4h 50m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25154.patch
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=608311=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608311
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 06:55
Start Date: 08/Jun/21 06:55
Worklog Time Spent: 10m 
  Work Description: pkumarsinha removed a comment on pull request #2311:
URL: https://github.com/apache/hive/pull/2311#issuecomment-856505462


   Committed to master.
   
   Thanks for the patch, @hmangla98 !!




Issue Time Tracking
---

Worklog Id: (was: 608311)
Time Spent: 5h 10m  (was: 5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25154.patch
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=608309=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608309
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 06:53
Start Date: 08/Jun/21 06:53
Worklog Time Spent: 10m 
  Work Description: pkumarsinha merged pull request #2311:
URL: https://github.com/apache/hive/pull/2311


   




Issue Time Tracking
---

Worklog Id: (was: 608309)
Time Spent: 4h 50m  (was: 4h 40m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25154.patch
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=608218=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608218
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 02:07
Start Date: 08/Jun/21 02:07
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r647057749



##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java
##
@@ -608,6 +607,63 @@ private void testNoStatsUpdateForReplTable(String tblNamePrefix, String txnPrope
     msClient.close();
   }
 
+  @Test(timeout=8)
+  public void testNoStatsUpdateForSimpleFailoverDb() throws Exception {
+    testNoStatsUpdateForFailoverDb("simple", "");
+  }
+
+  @Test(timeout=8)
+  public void testNoStatsUpdateForTxnFailoverDb() throws Exception {
+    testNoStatsUpdateForFailoverDb("txn",
+        "TBLPROPERTIES (\"transactional\"=\"true\",\"transactional_properties\"=\"insert_only\")");
+  }
+
+  private void testNoStatsUpdateForFailoverDb(String tblNamePrefix, String txnProperty) throws Exception {
+    // Set high worker count so we get a longer queue.
+    hiveConf.setInt(MetastoreConf.ConfVars.STATS_AUTO_UPDATE_WORKER_COUNT.getVarname(), 4);
+    String tblWOStats = tblNamePrefix + "_repl_failover_nostats";
+    String ptnTblWOStats = tblNamePrefix + "_ptn_repl_failover_nostats";
+    String dbName = ss.getCurrentDatabase();
+    StatsUpdaterThread su = createUpdater();
+    IMetaStoreClient msClient = new HiveMetaStoreClient(hiveConf);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVESTATSAUTOGATHER, false);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVESTATSCOLAUTOGATHER, false);
+
+    executeQuery("create table " + tblWOStats + "(i int, s string) " + txnProperty);
+    executeQuery("insert into " + tblWOStats + "(i, s) values (1, 'test')");
+    verifyStatsUpToDate(tblWOStats, Lists.newArrayList("i"), msClient, false);
+
+    executeQuery("create table " + ptnTblWOStats + "(s string) partitioned by (i int) " + txnProperty);
+    executeQuery("insert into " + ptnTblWOStats + "(i, s) values (1, 'test')");
+    executeQuery("insert into " + ptnTblWOStats + "(i, s) values (2, 'test2')");
+    executeQuery("insert into " + ptnTblWOStats + "(i, s) values (3, 'test3')");
+    verifyPartStatsUpToDate(3, 1, msClient, ptnTblWOStats, false);
+
+    assertTrue(su.runOneIteration());
+    Assert.assertEquals(2, su.getQueueLength());
+    executeQuery("alter database " + dbName + " set dbproperties('" + ReplConst.REPL_FAILOVER_ENABLED + "'='true')");
+    //StatsUpdaterThread would not run analyze commands for the tables which were inserted before
+    //failover property was enabled for that database
+    drainWorkQueue(su, 2);
+    verifyStatsUpToDate(tblWOStats, Lists.newArrayList("i"), msClient, false);
+    verifyPartStatsUpToDate(3, 1, msClient, ptnTblWOStats, false);
+    Assert.assertEquals(0, su.getQueueLength());
+
+    executeQuery("create table new_table(s string) partitioned by (i int) " + txnProperty);
+    executeQuery("insert into new_table(i, s) values (4, 'test4')");
+
+    assertFalse(su.runOneIteration());
+    Assert.assertEquals(0, su.getQueueLength());
+    verifyStatsUpToDate(tblWOStats, Lists.newArrayList("i"), msClient, false);
+    verifyPartStatsUpToDate(3, 1, msClient, ptnTblWOStats, false);
+
+    executeQuery("alter database " + dbName + " set dbproperties('" + ReplConst.REPL_FAILOVER_ENABLED + "'='')");
+    executeQuery("drop table " + tblWOStats);

Review comment:
   Actually, after removing the REPL_FAILOVER_ENABLED property, it would make
sense to re-verify that stats updates happen again.






Issue Time Tracking
---

Worklog Id: (was: 608218)
Time Spent: 4h 40m  (was: 4.5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=608215=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608215
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 02:04
Start Date: 08/Jun/21 02:04
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r647056770



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -169,7 +168,7 @@ public void run() {
           // this always runs in 'sync' mode where partitions can be added and dropped
           MsckInfo msckInfo = new MsckInfo(table.getCatName(), table.getDbName(), table.getTableName(),
             null, null, true, true, true, retentionSeconds);
-          executorService.submit(new MsckThread(msckInfo, msckConf, qualifiedTableName, countDownLatch));
+          executorService.submit(new MsckThread(msckInfo, msckConf, qualifiedTableName, countDownLatch, msc));

Review comment:
   Sorry, it looks like I missed this change earlier. Sharing an HMS client
across threads might not be a good idea. If all threads make calls at the
same time, the results might get mixed up. It would be better to use a
separate client for each of them.
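
A minimal sketch of the "separate client per thread" suggestion above, assuming nothing about the real HMS client API: `FakeClient`, `repair`, and `repairAll` are hypothetical stand-ins invented for illustration. Each submitted task builds its own client from a factory and closes it when done, so no client instance is ever shared between worker threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class PerTaskClientSketch {
  /** Hypothetical stand-in for a metastore client that is assumed not to be thread-safe. */
  static final class FakeClient implements AutoCloseable {
    static final AtomicInteger CREATED = new AtomicInteger();
    static final AtomicInteger CLOSED = new AtomicInteger();

    FakeClient() {
      CREATED.incrementAndGet();
    }

    String repair(String table) {
      return "repaired " + table;
    }

    @Override
    public void close() {
      CLOSED.incrementAndGet();
    }
  }

  /** Runs one task per table; every task opens and closes its own client. */
  static List<String> repairAll(List<String> tables, Supplier<FakeClient> clientFactory) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (String table : tables) {
        futures.add(pool.submit(() -> {
          // The client is created inside the task, so it is confined to one thread.
          try (FakeClient client = clientFactory.get()) {
            return client.repair(table);
          }
        }));
      }
      List<String> results = new ArrayList<>();
      for (Future<String> future : futures) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

The trade-off is one connection setup per task instead of one per run; whether that cost is acceptable for the real client is not something this thread settles.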






Issue Time Tracking
---

Worklog Id: (was: 608215)
Time Spent: 4.5h  (was: 4h 20m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=606580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-606580
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 04/Jun/21 08:00
Start Date: 04/Jun/21 08:00
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r645236553



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -210,27 +213,34 @@ private void stopWorkers() {
     }
   }
 
-  private List processOneTable(TableName fullTableName)
+  private List processOneTable(TableName fullTableName, Map dbsToSkip)
       throws MetaException, NoSuchTxnException, NoSuchObjectException {
     if (isAnalyzeTableInProgress(fullTableName)) return null;
     String cat = fullTableName.getCat(), db = fullTableName.getDb(), tbl = fullTableName.getTable();
+    String dbName = MetaStoreUtils.prependCatalogToDbName(cat,db, conf);
+    if (!dbsToSkip.containsKey(dbName)) {
+      Database database = rs.getDatabase(cat, db);
+      boolean skipDb = false;
+      if (MetaStoreUtils.isDbBeingFailedOver(database)) {
+        skipDb = true;
+        LOG.info("Skipping all the tables which belong to database: {} as it is being failed over", db);
+      } else if (ReplUtils.isTargetOfReplication(database)) {

Review comment:
   We already had two separate methods, one declared in ReplUtils and one in
PartitionManagementTask, because the ReplUtils package is not accessible in
MetastoreThreads. But the method can be moved to MetaStoreUtils, where it is
accessible to both of them.






Issue Time Tracking
---

Worklog Id: (was: 606580)
Time Spent: 4h 20m  (was: 4h 10m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=606478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-606478
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 04/Jun/21 07:48
Start Date: 04/Jun/21 07:48
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r644531296



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -625,6 +633,11 @@ public boolean runOneWorkerIteration(
 }
 String cmd = null;
 try {
+  TableName tb = req.tableName;

Review comment:
   If the very first table belongs to a database that is being failed over,
it will break the "doWait" logic.

##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
##
@@ -229,6 +231,15 @@ public static boolean isExternalTable(Table table) {
     return isExternal(params);
   }
 
+  public static boolean isDbBeingFailedOver(Database db) {
+    assert (db != null);
+    Map dbParameters = db.getParameters();
+    if ((dbParameters != null) && (dbParameters.containsKey(ReplConst.REPL_FAILOVER_ENABLED))) {
+      return ReplConst.TRUE.equals(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));
+    }
+    return false;

Review comment:
   Does this single line suffice?
   return dbParameters != null && ReplConst.TRUE.equals(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));

##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
##
@@ -229,6 +231,15 @@ public static boolean isExternalTable(Table table) {
     return isExternal(params);
   }
 
+  public static boolean isDbBeingFailedOver(Database db) {
+    assert (db != null);
+    Map dbParameters = db.getParameters();
+    if ((dbParameters != null) && (dbParameters.containsKey(ReplConst.REPL_FAILOVER_ENABLED))) {
+      return ReplConst.TRUE.equals(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));
+    }
+    return false;

Review comment:
   Also, ReplConst.TRUE.equals: do we need to handle case sensitivity?

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -133,11 +135,21 @@ public void run() {
       LOG.info("Looking for tables using catalog: {} dbPattern: {} tablePattern: {} found: {}", catalogName,
         dbPattern, tablePattern, foundTableMetas.size());
 
+      Map databasesToSkip = new HashMap<>();
+
       for (TableMeta tableMeta : foundTableMetas) {
         try {
+          String dbName = MetaStoreUtils.prependCatalogToDbName(tableMeta.getCatName(), tableMeta.getDbName(), conf);
+          if (!databasesToSkip.containsKey(dbName)) {
+            Database db = msc.getDatabase(tableMeta.getCatName(), tableMeta.getDbName());
+            databasesToSkip.put(dbName, isTargetOfReplication(db) || MetaStoreUtils.isDbBeingFailedOver(db));
+          }
+          if (databasesToSkip.get(dbName)) {
+            LOG.info("Skipping table : {}", tableMeta.getTableName());

Review comment:
   use debug.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -133,11 +135,21 @@ public void run() {
       LOG.info("Looking for tables using catalog: {} dbPattern: {} tablePattern: {} found: {}", catalogName,
         dbPattern, tablePattern, foundTableMetas.size());
 
+      Map databasesToSkip = new HashMap<>();
+
       for (TableMeta tableMeta : foundTableMetas) {
         try {
+          String dbName = MetaStoreUtils.prependCatalogToDbName(tableMeta.getCatName(), tableMeta.getDbName(), conf);
+          if (!databasesToSkip.containsKey(dbName)) {
+            Database db = msc.getDatabase(tableMeta.getCatName(), tableMeta.getDbName());
+            databasesToSkip.put(dbName, isTargetOfReplication(db) || MetaStoreUtils.isDbBeingFailedOver(db));

Review comment:
   Add an INFO-level log for the DB stating why it is being skipped.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -625,6 +633,11 @@ public boolean runOneWorkerIteration(
 }
 String cmd = null;
 try {
+  TableName tb = req.tableName;

Review comment:
   Add a test where the DB being failed over is picked up first and the
other DB is picked up later.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -210,27 +213,34 @@ private void stopWorkers() {
 }
   }
 
-  private List processOneTable(TableName fullTableName)
+  private List 

[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=606333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-606333
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 04/Jun/21 01:51
Start Date: 04/Jun/21 01:51
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r645236553



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -210,27 +213,34 @@ private void stopWorkers() {
     }
   }
 
-  private List processOneTable(TableName fullTableName)
+  private List processOneTable(TableName fullTableName, Map dbsToSkip)
       throws MetaException, NoSuchTxnException, NoSuchObjectException {
     if (isAnalyzeTableInProgress(fullTableName)) return null;
     String cat = fullTableName.getCat(), db = fullTableName.getDb(), tbl = fullTableName.getTable();
+    String dbName = MetaStoreUtils.prependCatalogToDbName(cat,db, conf);
+    if (!dbsToSkip.containsKey(dbName)) {
+      Database database = rs.getDatabase(cat, db);
+      boolean skipDb = false;
+      if (MetaStoreUtils.isDbBeingFailedOver(database)) {
+        skipDb = true;
+        LOG.info("Skipping all the tables which belong to database: {} as it is being failed over", db);
+      } else if (ReplUtils.isTargetOfReplication(database)) {

Review comment:
   We already had two separate methods, one declared in ReplUtils and one in
PartitionManagementTask, because the ReplUtils package is not accessible in
MetastoreThreads. But the method can be moved to MetaStoreUtils, where it is
accessible to both of them.






Issue Time Tracking
---

Worklog Id: (was: 606333)
Time Spent: 4h  (was: 3h 50m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=606133=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-606133
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 03/Jun/21 19:05
Start Date: 03/Jun/21 19:05
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r645057607



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -210,27 +213,34 @@ private void stopWorkers() {
     }
   }
 
-  private List processOneTable(TableName fullTableName)
+  private List processOneTable(TableName fullTableName, Map dbsToSkip)
       throws MetaException, NoSuchTxnException, NoSuchObjectException {
     if (isAnalyzeTableInProgress(fullTableName)) return null;
     String cat = fullTableName.getCat(), db = fullTableName.getDb(), tbl = fullTableName.getTable();
+    String dbName = MetaStoreUtils.prependCatalogToDbName(cat,db, conf);
+    if (!dbsToSkip.containsKey(dbName)) {
+      Database database = rs.getDatabase(cat, db);
+      boolean skipDb = false;
+      if (MetaStoreUtils.isDbBeingFailedOver(database)) {
+        skipDb = true;
+        LOG.info("Skipping all the tables which belong to database: {} as it is being failed over", db);
+      } else if (ReplUtils.isTargetOfReplication(database)) {

Review comment:
   There is a lot of code duplication between this and
PartitionManagementTask. Can we not manage with a single copy?
   Also, why do we have two methods for isTargetOfReplication()? Can we have
just one?






Issue Time Tracking
---

Worklog Id: (was: 606133)
Time Spent: 3h 50m  (was: 3h 40m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=605713=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605713
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 03/Jun/21 07:15
Start Date: 03/Jun/21 07:15
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r644544768



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -625,6 +633,11 @@ public boolean runOneWorkerIteration(
 }
 String cmd = null;
 try {
+  TableName tb = req.tableName;

Review comment:
   Add a test where the DB being failed over is picked up first and the
other DB is picked up later.






Issue Time Tracking
---

Worklog Id: (was: 605713)
Time Spent: 3h 40m  (was: 3.5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=605710=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605710
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 03/Jun/21 07:13
Start Date: 03/Jun/21 07:13
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r644543669



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -133,11 +135,21 @@ public void run() {
       LOG.info("Looking for tables using catalog: {} dbPattern: {} tablePattern: {} found: {}", catalogName,
         dbPattern, tablePattern, foundTableMetas.size());
 
+      Map databasesToSkip = new HashMap<>();
+
       for (TableMeta tableMeta : foundTableMetas) {
         try {
+          String dbName = MetaStoreUtils.prependCatalogToDbName(tableMeta.getCatName(), tableMeta.getDbName(), conf);
+          if (!databasesToSkip.containsKey(dbName)) {
+            Database db = msc.getDatabase(tableMeta.getCatName(), tableMeta.getDbName());
+            databasesToSkip.put(dbName, isTargetOfReplication(db) || MetaStoreUtils.isDbBeingFailedOver(db));

Review comment:
   Add an INFO-level log for the DB stating why it is being skipped.
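
One way to read this request, sketched under assumptions: decide once per database, record the reason at INFO level at that moment, and answer later tables of the same database from the cache. `DbState`, `shouldSkip`, and the `infoLog` list (which stands in for `LOG.info`) are hypothetical names, not part of the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class SkipDecisionCacheSketch {
  /** Hypothetical, simplified view of the two database states checked in the patch. */
  static final class DbState {
    final boolean targetOfReplication;
    final boolean beingFailedOver;

    DbState(boolean targetOfReplication, boolean beingFailedOver) {
      this.targetOfReplication = targetOfReplication;
      this.beingFailedOver = beingFailedOver;
    }
  }

  // Stands in for LOG.info so the sketch has no logging dependency.
  final List<String> infoLog = new ArrayList<>();
  private final Map<String, Boolean> databasesToSkip = new HashMap<>();

  /** Returns true if tables of dbName should be skipped; looks the database up at most once. */
  boolean shouldSkip(String dbName, Function<String, DbState> lookup) {
    Boolean cached = databasesToSkip.get(dbName);
    if (cached != null) {
      return cached; // decided earlier: no second lookup and no second INFO line
    }
    DbState state = lookup.apply(dbName);
    String reason = null;
    if (state.beingFailedOver) {
      reason = "it is being failed over";
    } else if (state.targetOfReplication) {
      reason = "it is a target of replication";
    }
    if (reason != null) {
      infoLog.add("Skipping all tables of database " + dbName + " because " + reason);
    }
    databasesToSkip.put(dbName, reason != null);
    return reason != null;
  }
}
```

Logging at the point where the decision is cached keeps the INFO output to one line per skipped database, which leaves per-table messages free to drop to debug level.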






Issue Time Tracking
---

Worklog Id: (was: 605710)
Time Spent: 3.5h  (was: 3h 20m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=605709=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605709
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 03/Jun/21 07:12
Start Date: 03/Jun/21 07:12
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r644543165



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -133,11 +135,21 @@ public void run() {
       LOG.info("Looking for tables using catalog: {} dbPattern: {} tablePattern: {} found: {}", catalogName,
         dbPattern, tablePattern, foundTableMetas.size());
 
+      Map databasesToSkip = new HashMap<>();
+
       for (TableMeta tableMeta : foundTableMetas) {
         try {
+          String dbName = MetaStoreUtils.prependCatalogToDbName(tableMeta.getCatName(), tableMeta.getDbName(), conf);
+          if (!databasesToSkip.containsKey(dbName)) {
+            Database db = msc.getDatabase(tableMeta.getCatName(), tableMeta.getDbName());
+            databasesToSkip.put(dbName, isTargetOfReplication(db) || MetaStoreUtils.isDbBeingFailedOver(db));
+          }
+          if (databasesToSkip.get(dbName)) {
+            LOG.info("Skipping table : {}", tableMeta.getTableName());

Review comment:
   use debug.






Issue Time Tracking
---

Worklog Id: (was: 605709)
Time Spent: 3h 20m  (was: 3h 10m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=605707=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605707
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 03/Jun/21 07:08
Start Date: 03/Jun/21 07:08
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r644540793



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
##
@@ -229,6 +231,15 @@ public static boolean isExternalTable(Table table) {
     return isExternal(params);
   }
 
+  public static boolean isDbBeingFailedOver(Database db) {
+    assert (db != null);
+    Map dbParameters = db.getParameters();
+    if ((dbParameters != null) && (dbParameters.containsKey(ReplConst.REPL_FAILOVER_ENABLED))) {
+      return ReplConst.TRUE.equals(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));
+    }
+    return false;

Review comment:
   Also, ReplConst.TRUE.equals: do we need to handle case sensitivity?
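
If case-insensitive matching turns out to be wanted, one possible shape is sketched below. The property key is a hypothetical stand-in for `ReplConst.REPL_FAILOVER_ENABLED`, and the thread does not say which behaviour the patch finally adopted. `String.equalsIgnoreCase` returns false for a null argument, so a missing key needs no separate check.

```java
import java.util.Collections;
import java.util.Map;

final class CaseInsensitiveFailoverCheck {
  // Hypothetical key name; the real constant lives in Hive's ReplConst.
  static final String REPL_FAILOVER_ENABLED = "repl.failover.enabled";

  /** Treats "true", "TRUE", "True", ... as enabled; anything else (or no value) as disabled. */
  static boolean isDbBeingFailedOver(Map<String, String> dbParameters) {
    return dbParameters != null
        && "true".equalsIgnoreCase(dbParameters.get(REPL_FAILOVER_ENABLED));
  }

  public static void main(String[] args) {
    Map<String, String> upper = Collections.singletonMap(REPL_FAILOVER_ENABLED, "TRUE");
    System.out.println(isDbBeingFailedOver(upper));
  }
}
```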






Issue Time Tracking
---

Worklog Id: (was: 605707)
Time Spent: 3h 10m  (was: 3h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=605705=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605705
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 03/Jun/21 07:07
Start Date: 03/Jun/21 07:07
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r644540036



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
##
@@ -229,6 +231,15 @@ public static boolean isExternalTable(Table table) {
 return isExternal(params);
   }
 
+  public static boolean isDbBeingFailedOver(Database db) {
+assert (db != null);
+Map<String, String> dbParameters = db.getParameters();
+if ((dbParameters != null) && 
(dbParameters.containsKey(ReplConst.REPL_FAILOVER_ENABLED))) {
+  return 
ReplConst.TRUE.equals(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));
+}
+return false;

Review comment:
   Does this single line suffice?
   return dbParameters != null && 
ReplConst.TRUE.equals(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));
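
The single-line form works because `Map.get` returns null for a missing key and `TRUE.equals(null)` is false, so the `containsKey` guard adds nothing. A small sketch comparing the two shapes, assuming a plain `Map<String, String>` and hypothetical constants `KEY` and `TRUE` in place of the `ReplConst` fields:

```java
import java.util.Map;

final class SingleLineCheckSketch {
  // Hypothetical stand-ins for the ReplConst constants; the values are assumptions.
  static final String KEY = "repl.failover.enabled";
  static final String TRUE = "true";

  // Shape of the method as submitted: explicit containsKey guard.
  static boolean original(Map<String, String> p) {
    if (p != null && p.containsKey(KEY)) {
      return TRUE.equals(p.get(KEY));
    }
    return false;
  }

  // Single-line form: get() yields null for a missing key, and TRUE.equals(null) is false.
  static boolean simplified(Map<String, String> p) {
    return p != null && TRUE.equals(p.get(KEY));
  }
}
```

Both methods should agree for a null map, a missing key, and any stored value.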






Issue Time Tracking
---

Worklog Id: (was: 605705)
Time Spent: 3h  (was: 2h 50m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=605698=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605698
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 03/Jun/21 06:51
Start Date: 03/Jun/21 06:51
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r644531296



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -625,6 +633,11 @@ public boolean runOneWorkerIteration(
 }
 String cmd = null;
 try {
+  TableName tb = req.tableName;

Review comment:
   If the very first table belongs to DbBeingFailover, it will break the 
logic for "doWait"






Issue Time Tracking
---

Worklog Id: (was: 605698)
Time Spent: 2h 50m  (was: 2h 40m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=605053=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-605053
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 02/Jun/21 09:40
Start Date: 02/Jun/21 09:40
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r643786935



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -646,6 +668,10 @@ public boolean runOneWorkerIteration(
 return true;
   }
 
+  private synchronized boolean isDbPresentInFailoverSet(String dbName) {
+return dbsBeingFailedOver.containsKey(dbName) && 
dbsBeingFailedOver.get(dbName);

Review comment:
   This will not give the desired protection. The atomicity part is still 
vulnerable
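
One reading of this comment: making only the read method `synchronized` does not make the surrounding "check the map, then look up the database, then store the result" sequence atomic, so two workers can still interleave between the check and the write. A common way to close that gap is `ConcurrentHashMap.computeIfAbsent`, which performs the lookup and the store as one atomic step per key. The sketch below is an illustration under that assumption, not the approach the patch ended up with; the class name and the `lookup` predicate (standing in for the metastore call) are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;

final class FailoverCacheSketch {
  private final ConcurrentMap<String, Boolean> dbsBeingFailedOver = new ConcurrentHashMap<>();
  // Hypothetical stand-in for MetaStoreUtils.isDbBeingFailedOver(rs.getDatabase(...)).
  private final Predicate<String> lookup;

  FailoverCacheSketch(Predicate<String> lookup) {
    this.lookup = lookup;
  }

  boolean isBeingFailedOver(String dbName) {
    // The lookup runs at most once per key; check and store happen atomically.
    return dbsBeingFailedOver.computeIfAbsent(dbName, lookup::test);
  }

  // Called at the start of each iteration so stale answers are dropped.
  void clear() {
    dbsBeingFailedOver.clear();
  }
}
```

Note that `computeIfAbsent` holds a bin lock while the mapping function runs, so a slow metastore call inside it can delay other writers that hash to the same bin.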






Issue Time Tracking
---

Worklog Id: (was: 605053)
Time Spent: 2h 40m  (was: 2.5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604953=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604953
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 02/Jun/21 05:24
Start Date: 02/Jun/21 05:24
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r643661160



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -625,6 +639,16 @@ public boolean runOneWorkerIteration(
 }
 String cmd = null;
 try {
+  TableName tb = req.tableName;
+  String dbName = 
MetaStoreUtils.prependCatalogToDbName(tb.getCat(),tb.getDb(), conf);
+  if (dbsBeingFailedOver.contains(dbName)
+  || 
MetaStoreUtils.isDbBeingFailedOver(rs.getDatabase(tb.getCat(), tb.getDb()))) {
+if (!dbsBeingFailedOver.contains(dbName)) {

Review comment:
   You can simplify this. We don't need this check every time.






Issue Time Tracking
---

Worklog Id: (was: 604953)
Time Spent: 2.5h  (was: 2h 20m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604944=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604944
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 02/Jun/21 04:42
Start Date: 02/Jun/21 04:42
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r643647370



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -625,6 +639,16 @@ public boolean runOneWorkerIteration(
 }
 String cmd = null;
 try {
+  TableName tb = req.tableName;
+  String dbName = 
MetaStoreUtils.prependCatalogToDbName(tb.getCat(),tb.getDb(), conf);
+  if (dbsBeingFailedOver.contains(dbName)
+  || 
MetaStoreUtils.isDbBeingFailedOver(rs.getDatabase(tb.getCat(), tb.getDb()))) {
+if (!dbsBeingFailedOver.contains(dbName)) {

Review comment:
   If the current dbName is not present in the dbsBeingFailedOver set, this 
adds the db to it.






Issue Time Tracking
---

Worklog Id: (was: 604944)
Time Spent: 2h 20m  (was: 2h 10m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604938=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604938
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 02/Jun/21 04:20
Start Date: 02/Jun/21 04:20
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r643399198



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -625,6 +639,16 @@ public boolean runOneWorkerIteration(
 }
 String cmd = null;
 try {
+  TableName tb = req.tableName;
+  String dbName = 
MetaStoreUtils.prependCatalogToDbName(tb.getCat(),tb.getDb(), conf);
+  if (dbsBeingFailedOver.contains(dbName)
+  || 
MetaStoreUtils.isDbBeingFailedOver(rs.getDatabase(tb.getCat(), tb.getDb()))) {
+if (!dbsBeingFailedOver.contains(dbName)) {

Review comment:
   How will this condition be true ?

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -222,17 +237,31 @@ private void setupMsckPathInvalidation() {
 private Configuration conf;
 private String qualifiedTableName;
 private CountDownLatch countDownLatch;
+private Set dbsBeingFailedOver;
+private IMetaStoreClient msc;
 
-MsckThread(MsckInfo msckInfo, Configuration conf, String 
qualifiedTableName, CountDownLatch countDownLatch) {
+MsckThread(MsckInfo msckInfo, Configuration conf, String 
qualifiedTableName,
+   CountDownLatch countDownLatch, Set dbsBeingFailedOver, 
IMetaStoreClient msc) {
   this.msckInfo = msckInfo;
   this.conf = conf;
   this.qualifiedTableName = qualifiedTableName;
   this.countDownLatch = countDownLatch;
+  this.dbsBeingFailedOver = dbsBeingFailedOver;
+  this.msc = msc;
 }
 
 @Override
 public void run() {
   try {
+String dbName = 
MetaStoreUtils.prependCatalogToDbName(msckInfo.getCatalogName(), 
msckInfo.getDbName(), conf);
+if (dbsBeingFailedOver.contains(dbName) ||
+
MetaStoreUtils.isDbBeingFailedOver(msc.getDatabase(msckInfo.getCatalogName(), 
msckInfo.getDbName()))) {
+  if (!dbsBeingFailedOver.contains(dbName)) {

Review comment:
   This isn't thread-safe
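
The pattern in the diff is a check-then-act on a set shared between `MsckThread` instances: `contains` followed by an add. With a plain `HashSet` that is unsafe under concurrent access. A minimal sketch of one alternative, assuming the set can be swapped for a concurrent one (the class and method names below are hypothetical, not from the patch):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class SkipSetSketch {
  // Thread-safe set backed by ConcurrentHashMap; safe to share across worker threads.
  private final Set<String> dbsBeingFailedOver = ConcurrentHashMap.newKeySet();

  // add() is atomic and returns true only for the caller that actually inserted the
  // name, so no separate contains() check is needed before adding.
  boolean markFailedOver(String dbName) {
    return dbsBeingFailedOver.add(dbName);
  }

  boolean shouldSkip(String dbName) {
    return dbsBeingFailedOver.contains(dbName);
  }
}
```

The return value of `markFailedOver` can also drive one-time work, such as logging the skip only once per database.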






Issue Time Tracking
---

Worklog Id: (was: 604938)
Time Spent: 2h 10m  (was: 2h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604872=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604872
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 02/Jun/21 01:25
Start Date: 02/Jun/21 01:25
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r643587387



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -94,6 +98,8 @@
   private BlockingQueue workQueue;
   private Thread[] workers;
 
+  private Set dbsBeingFailedOver;

Review comment:
   This is the only way we can access this set within each iteration of 
StatsUpdater and also within each execution of the actual analysis work after 
dequeuing from the worker queue. This set is cleared when a new iteration of this 
thread kicks in.






Issue Time Tracking
---

Worklog Id: (was: 604872)
Time Spent: 2h  (was: 1h 50m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604704
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 01/Jun/21 18:49
Start Date: 01/Jun/21 18:49
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r643398280



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -94,6 +98,8 @@
   private BlockingQueue workQueue;
   private Thread[] workers;
 
+  private Set dbsBeingFailedOver;

Review comment:
   Why do we need it at instance level?






Issue Time Tracking
---

Worklog Id: (was: 604704)
Time Spent: 1h 50m  (was: 1h 40m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604361=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604361
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 01/Jun/21 08:03
Start Date: 01/Jun/21 08:03
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r642873937



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -214,23 +219,28 @@ private void stopWorkers() {
   throws MetaException, NoSuchTxnException, NoSuchObjectException {
 if (isAnalyzeTableInProgress(fullTableName)) return null;
 String cat = fullTableName.getCat(), db = fullTableName.getDb(), tbl = 
fullTableName.getTable();
+String dbName = MetaStoreUtils.prependCatalogToDbName(cat,db, conf);
+if (!isDbTargetOfReplication.containsKey(dbName) || 
!isDbBeingFailedOver.containsKey(dbName)) {
+  Database database = rs.getDatabase(cat, db);
+  isDbTargetOfReplication.put(dbName, 
ReplUtils.isTargetOfReplication(database));
+  isDbBeingFailedOver.put(dbName, 
MetaStoreUtils.isDbBeingFailedOver(database));

Review comment:
   Why do we need two separate maps? We don't need the reason for the skip; 
just tracking what to skip is fine, no?






Issue Time Tracking
---

Worklog Id: (was: 604361)
Time Spent: 1h 40m  (was: 1.5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604358=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604358
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 01/Jun/21 07:54
Start Date: 01/Jun/21 07:54
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r642867437



##
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
##
@@ -84,6 +86,9 @@
   private ConcurrentHashMap partsInProgress = new 
ConcurrentHashMap<>();
   private AtomicInteger itemsInProgress = new AtomicInteger(0);
 
+  Map isDbTargetOfReplication = new HashMap<>();
+  Map isDbBeingFailedOver = new HashMap<>();

Review comment:
   Why do you need it at instance level?
   When is the map getting cleaned?






Issue Time Tracking
---

Worklog Id: (was: 604358)
Time Spent: 1.5h  (was: 1h 20m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604322
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 01/Jun/21 05:47
Start Date: 01/Jun/21 05:47
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r642797415



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -137,7 +138,8 @@ public void run() {
   try {
 Table table = msc.getTable(tableMeta.getCatName(), 
tableMeta.getDbName(), tableMeta.getTableName());
 Database db = msc.getDatabase(table.getCatName(), 
table.getDbName());
-if (partitionDiscoveryEnabled(table.getParameters()) && 
!isTargetOfReplication(db)) {
+if (partitionDiscoveryEnabled(table.getParameters()) && 
!isTargetOfReplication(db)

Review comment:
   Changes already done.






Issue Time Tracking
---

Worklog Id: (was: 604322)
Time Spent: 1h 20m  (was: 1h 10m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604321=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604321
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 01/Jun/21 05:46
Start Date: 01/Jun/21 05:46
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r642797044



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -137,7 +138,8 @@ public void run() {
   try {
 Table table = msc.getTable(tableMeta.getCatName(), 
tableMeta.getDbName(), tableMeta.getTableName());
 Database db = msc.getDatabase(table.getCatName(), 
table.getDbName());
-if (partitionDiscoveryEnabled(table.getParameters()) && 
!isTargetOfReplication(db)) {
+if (partitionDiscoveryEnabled(table.getParameters()) && 
!isTargetOfReplication(db)

Review comment:
   For any of the tables of a db which is a target of replication, we may 
not need msc.getTable call






Issue Time Tracking
---

Worklog Id: (was: 604321)
Time Spent: 1h 10m  (was: 1h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=604317=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-604317
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 01/Jun/21 04:06
Start Date: 01/Jun/21 04:06
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r642767227



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -162,6 +164,10 @@ public void run() {
 setupMsckPathInvalidation();
 Configuration msckConf = Msck.getMsckConf(conf);
 for (Table table : candidateTables) {
+  if 
(MetaStoreUtils.isDbBeingFailedOver(msc.getDatabase(table.getCatName(), 
table.getDbName()))) {

Review comment:
   The cache can be short-lived and rebuilt on every run. Even then, we 
make at most one call per db.
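
The suggestion can be sketched as a map that lives only for one pass over the candidate tables, so a property change is picked up on the next run and each database is still queried at most once per run. This is an illustration only: tables are modelled as `{db, table}` string pairs, and the `isDbBeingFailedOver` predicate is a hypothetical stand-in for the `msc.getDatabase` call plus the `MetaStoreUtils` check.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class PerRunCacheSketch {
  // Returns the tables whose database is not being failed over.
  static List<String[]> filter(List<String[]> tables, Predicate<String> isDbBeingFailedOver) {
    Map<String, Boolean> cache = new HashMap<>(); // rebuilt on every call, so never stale across runs
    List<String[]> kept = new ArrayList<>();
    for (String[] t : tables) { // t[0] = db name, t[1] = table name
      // The predicate is evaluated only the first time a db name is seen in this run.
      boolean skip = cache.computeIfAbsent(t[0], isDbBeingFailedOver::test);
      if (!skip) {
        kept.add(t);
      }
    }
    return kept;
  }
}
```

Because the map is local to a single-threaded loop, a plain `HashMap` is enough here; sharing it with worker threads would need a concurrent structure instead.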






Issue Time Tracking
---

Worklog Id: (was: 604317)
Time Spent: 1h  (was: 50m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603611
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 16:28
Start Date: 28/May/21 16:28
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r641674237



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -162,6 +164,10 @@ public void run() {
 setupMsckPathInvalidation();
 Configuration msckConf = Msck.getMsckConf(conf);
 for (Table table : candidateTables) {
+  if 
(MetaStoreUtils.isDbBeingFailedOver(msc.getDatabase(table.getCatName(), 
table.getDbName()))) {

Review comment:
   Might not be doable. Say we somehow cached the db failover property, and 
afterwards the repl.failover.enabled prop is set to true for that db. Now, how will 
we figure out that the cached data is outdated and needs to be fetched again?






Issue Time Tracking
---

Worklog Id: (was: 603611)
Time Spent: 50m  (was: 40m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603609
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 16:25
Start Date: 28/May/21 16:25
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r641672545



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestPartitionManagement.java
##
@@ -620,6 +620,59 @@ public void testNoPartitionDiscoveryForReplTable() throws 
Exception {
 assertEquals(3, partitions.size());
   }
 
+  @Test
+  public void testNoPartitionDiscoveryForFailoverDb() throws Exception {
+String dbName = "db_failover";
+String tableName = "tbl_failover";
+Map colMap = buildAllColumns();
+List partKeys = Lists.newArrayList("state", "dt");
+List partKeyTypes = Lists.newArrayList("string", "date");
+List<List<String>> partVals = Lists.newArrayList(
+Lists.newArrayList("__HIVE_DEFAULT_PARTITION__", "1990-01-01"),
+Lists.newArrayList("CA", "1986-04-28"),
+Lists.newArrayList("MN", "2018-11-31"));
+createMetadata(DEFAULT_CATALOG_NAME, dbName, tableName, partKeys, 
partKeyTypes, partVals, colMap, false);
+Table table = client.getTable(dbName, tableName);
+List partitions = client.listPartitions(dbName, tableName, 
(short) -1);
+assertEquals(3, partitions.size());
+String tableLocation = table.getSd().getLocation();
+URI location = URI.create(tableLocation);
+Path tablePath = new Path(location);
+FileSystem fs = FileSystem.get(location, conf);
+Path newPart1 = new Path(tablePath, "state=WA/dt=2018-12-01");
+Path newPart2 = new Path(tablePath, "state=UT/dt=2018-12-02");
+fs.mkdirs(newPart1);
+fs.mkdirs(newPart2);
+assertEquals(5, fs.listStatus(tablePath).length);
+partitions = client.listPartitions(dbName, tableName, (short) -1);
+assertEquals(3, partitions.size());
+
+// table property is set to true, but the table is marked as replication 
target. The new
+// partitions should not be created
+
table.getParameters().put(PartitionManagementTask.DISCOVER_PARTITIONS_TBLPROPERTY,
 "true");
+Database db = client.getDatabase(table.getDbName());
+db.putToParameters(ReplConst.REPL_FAILOVER_ENABLED, "true");
+client.alterDatabase(table.getDbName(), db);
+client.alter_table(dbName, tableName, table);

Review comment:
   Alter table will enable discover partition property for the table.






Issue Time Tracking
---

Worklog Id: (was: 603609)
Time Spent: 40m  (was: 0.5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603605=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603605
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 16:19
Start Date: 28/May/21 16:19
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r641668822



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
##
@@ -228,6 +230,15 @@ public static boolean isExternalTable(Table table) {
 return isExternal(params);
   }
 
+  public static boolean isDbBeingFailedOver(Database db) {
+assert (db != null);
+Map<String, String> dbParameters = db.getParameters();
+if ((dbParameters != null) && (dbParameters.containsKey(ReplConst.REPL_FAILOVER_ENABLED))) {
+  return !StringUtils.isEmpty(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));

Review comment:
   Done






Issue Time Tracking
---

Worklog Id: (was: 603605)
Time Spent: 0.5h  (was: 20m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603584&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603584
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 15:51
Start Date: 28/May/21 15:51
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r640356534



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -99,6 +99,15 @@ public boolean isTargetOfReplication(Database db) {
 return false;
   }
 
+  public static boolean isBeingFailovedOver(Database db) {

Review comment:
   We can move this to some util class to avoid duplication

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -162,6 +164,10 @@ public void run() {
 setupMsckPathInvalidation();
 Configuration msckConf = Msck.getMsckConf(conf);
 for (Table table : candidateTables) {
+  if (MetaStoreUtils.isDbBeingFailedOver(msc.getDatabase(table.getCatName(), table.getDbName()))) {

Review comment:
   This is going to be costly. One HMS call per table. Can we maintain a 
cache?
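
   One way to act on the caching suggestion above is to memoize the failover
   check per database for the duration of a single run, so that tables in the
   same database share one lookup. The following is a minimal sketch under
   stated assumptions, not the code from the patch: `FailoverCache` and its
   loader function are hypothetical stand-ins for the metastore client call
   (`msc.getDatabase(...)`), and `FAILOVER_KEY` is a placeholder for
   `ReplConst.REPL_FAILOVER_ENABLED`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-run cache: each database's parameters are loaded at most once. */
public class FailoverCache {
  // Placeholder key (assumption); the patch uses ReplConst.REPL_FAILOVER_ENABLED.
  public static final String FAILOVER_KEY = "repl.failover.enabled";

  // Stand-in for the metastore lookup: maps "catalog.database" to its parameters.
  private final Function<String, Map<String, String>> dbParamsLoader;
  private final Map<String, Boolean> cache = new HashMap<>();
  private int loads = 0;

  public FailoverCache(Function<String, Map<String, String>> dbParamsLoader) {
    this.dbParamsLoader = dbParamsLoader;
  }

  /** Returns the cached answer, calling the loader only on the first request per database. */
  public boolean isBeingFailedOver(String catName, String dbName) {
    String key = catName + "." + dbName;
    return cache.computeIfAbsent(key, k -> {
      loads++;
      Map<String, String> params = dbParamsLoader.apply(k);
      String value = (params == null) ? null : params.get(FAILOVER_KEY);
      // Same rule as the helper in the diff: key present with a non-empty value.
      return value != null && !value.isEmpty();
    });
  }

  /** Number of loader calls made so far; useful for checking the cache in a test. */
  public int loadCount() {
    return loads;
  }
}
```

   The sketch is not thread-safe and never invalidates entries, so it assumes
   a fresh instance is created for each run of the task and used from a single
   thread.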

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -99,6 +99,15 @@ public boolean isTargetOfReplication(Database db) {
 return false;
   }
 
+  public static boolean isBeingFailovedOver(Database db) {

Review comment:
   nit: Typo

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestPartitionManagement.java
##
@@ -659,6 +712,45 @@ public void testNoPartitionRetentionForReplTarget() throws TException, Interrupt
 assertEquals(3, partitions.size());
   }
 
+  @Test
+  public void testNoPartitionRetentionForFailoverDb() throws TException, InterruptedException {
+String dbName = "db_failover";
+String tableName = "tbl_failover";
+Map<String, Column> colMap = buildAllColumns();
+List<String> partKeys = Lists.newArrayList("state", "dt");
+List<String> partKeyTypes = Lists.newArrayList("string", "date");
+List<List<String>> partVals = Lists.newArrayList(
+Lists.newArrayList("__HIVE_DEFAULT_PARTITION__", "1990-01-01"),
+Lists.newArrayList("CA", "1986-04-28"),
+Lists.newArrayList("MN", "2018-11-31"));
+// Check for the existence of partitions 10 seconds after the partition retention period has
+// elapsed. Gives enough time for the partition retention task to work.
+long partitionRetentionPeriodMs = 2;
+long waitingPeriodForTest = partitionRetentionPeriodMs + 10 * 1000;
+createMetadata(DEFAULT_CATALOG_NAME, dbName, tableName, partKeys, partKeyTypes, partVals, colMap, false);
+Table table = client.getTable(dbName, tableName);
+List<Partition> partitions = client.listPartitions(dbName, tableName, (short) -1);
+assertEquals(3, partitions.size());
+
+table.getParameters().put(PartitionManagementTask.DISCOVER_PARTITIONS_TBLPROPERTY, "true");
+table.getParameters().put(PartitionManagementTask.PARTITION_RETENTION_PERIOD_TBLPROPERTY,
+partitionRetentionPeriodMs + "ms");
+client.alter_table(dbName, tableName, table);
+Database db = client.getDatabase(table.getDbName());
+db.putToParameters(ReplConst.REPL_FAILOVER_ENABLED, "true");

Review comment:
   Maybe we can cover both cases, with and without
ReplConst.REPL_FAILOVER_ENABLED, in the same test.
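
   As a rough illustration of covering both cases in one test, the sketch
   below checks the same database parameters first without the flag and then
   with it. It is self-contained and only approximates the patch:
   `FailoverFlagCases`, `shouldManagePartitions` and `FAILOVER_KEY` are
   hypothetical names, the predicate mirrors only the part of
   `isDbBeingFailedOver` visible in the diff (and assumes it returns false
   otherwise), and a real test would go through the metastore client instead
   of a plain map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in showing both flag states exercised by one check. */
public class FailoverFlagCases {
  // Placeholder key (assumption); the patch uses ReplConst.REPL_FAILOVER_ENABLED.
  public static final String FAILOVER_KEY = "repl.failover.enabled";

  /** Mirrors the visible predicate: key present and value non-empty. */
  public static boolean isDbBeingFailedOver(Map<String, String> dbParameters) {
    if (dbParameters != null && dbParameters.containsKey(FAILOVER_KEY)) {
      String value = dbParameters.get(FAILOVER_KEY);
      return value != null && !value.isEmpty();
    }
    return false;
  }

  /** Stand-in for the task's decision: manage partitions only when no failover is in progress. */
  public static boolean shouldManagePartitions(Map<String, String> dbParameters) {
    return !isDbBeingFailedOver(dbParameters);
  }

  /** Exercises both cases on the same parameters, changing only the flag. */
  public static void checkBothCases() {
    Map<String, String> params = new HashMap<>();

    // Case 1: flag absent, so partition management should proceed.
    if (!shouldManagePartitions(params)) {
      throw new AssertionError("expected management without the failover flag");
    }

    // Case 2: flag set, so partition management should be skipped.
    params.put(FAILOVER_KEY, "true");
    if (shouldManagePartitions(params)) {
      throw new AssertionError("expected no management with the failover flag");
    }
  }
}
```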

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestPartitionManagement.java
##
@@ -620,6 +620,59 @@ public void testNoPartitionDiscoveryForReplTable() throws Exception {
 assertEquals(3, partitions.size());
   }
 
+  @Test
+  public void testNoPartitionDiscoveryForFailoverDb() throws Exception {
+String dbName = "db_failover";
+String tableName = "tbl_failover";
+Map<String, Column> colMap = buildAllColumns();
+List<String> partKeys = Lists.newArrayList("state", "dt");
+List<String> partKeyTypes = Lists.newArrayList("string", "date");
+List<List<String>> partVals = Lists.newArrayList(
+Lists.newArrayList("__HIVE_DEFAULT_PARTITION__", "1990-01-01"),
+Lists.newArrayList("CA", "1986-04-28"),
+Lists.newArrayList("MN", "2018-11-31"));
+createMetadata(DEFAULT_CATALOG_NAME, dbName, tableName, partKeys, partKeyTypes, partVals, colMap, false);
+Table table = client.getTable(dbName, tableName);
+List<Partition> partitions = client.listPartitions(dbName, tableName, (short) -1);
+assertEquals(3, partitions.size());
+String tableLocation = 

[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=600903&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600903
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 23/May/21 00:50
Start Date: 23/May/21 00:50
Worklog Time Spent: 10m 
  Work Description: hmangla98 opened a new pull request #2311:
URL: https://github.com/apache/hive/pull/2311


   




Issue Time Tracking
---

Worklog Id: (was: 600903)
Remaining Estimate: 0h
Time Spent: 10m

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



