Re: Review Request 72815: ENGESC-3520: Deletion of non existing hive entities

2020-08-28 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/#review221744
---




addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 103 (patched)


Delete database and table entities in Atlas if not present in Hive



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 843 (patched)


getAllDatabaseInCluster(): consider retrieving only databases where clusterName 
== atlas.metadata.namespace, to ensure that databases and tables that belong to 
another metadata namespace are not deleted.
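
To illustrate the concern, here is a minimal sketch of restricting candidates to a single metadata namespace. It filters on the client side, which is a stand-in for the server-side filter suggested above, and it assumes qualified names of the form <dbName>@<namespace>. The class and method names are hypothetical and not part of the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HiveMetaStoreBridge.
class NamespaceFilterSketch {
    // Keeps only qualified names that end with "@<namespace>"
    // (assumed format: <dbName>@<namespace>).
    static List<String> inNamespace(List<String> qualifiedNames, String namespace) {
        String       suffix = "@" + namespace;
        List<String> result = new ArrayList<>();

        for (String qualifiedName : qualifiedNames) {
            if (qualifiedName != null && qualifiedName.endsWith(suffix)) {
                result.add(qualifiedName);
            }
        }

        return result;
    }

    public static void main(String[] args) {
        List<String> names = List.of("sales@cm", "hr@cm", "sales@other");

        // Only entries from namespace "cm" remain eligible for deletion checks.
        System.out.println(inNamespace(names, "cm"));
    }
}
```

Entities outside the configured namespace never reach the delete list this way, so a bridge run against one cluster cannot remove metadata imported from another.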



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 846 (patched)


Consider reading limit (pageSize) from configuration.
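
A minimal sketch of what reading the page size from configuration could look like. The property key and default value below are hypothetical, and java.util.Properties stands in for whatever configuration object the bridge actually uses.

```java
import java.util.Properties;

// Hypothetical helper, not part of HiveMetaStoreBridge.
class PageSizeConfigSketch {
    static final String PAGE_SIZE_KEY     = "hive.bridge.search.page.size"; // hypothetical key
    static final int    DEFAULT_PAGE_SIZE = 100;                            // hypothetical default

    // Returns the configured page size, falling back to the default when the
    // value is missing, not a number, or not positive.
    static int readPageSize(Properties conf) {
        String raw = conf.getProperty(PAGE_SIZE_KEY);

        if (raw == null) {
            return DEFAULT_PAGE_SIZE;
        }

        try {
            int value = Integer.parseInt(raw.trim());

            return value > 0 ? value : DEFAULT_PAGE_SIZE;
        } catch (NumberFormatException e) {
            return DEFAULT_PAGE_SIZE;
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        conf.setProperty(PAGE_SIZE_KEY, "250");

        System.out.println(readPageSize(conf));
    }
}
```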



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 850 (patched)


Consider the following for better readability:

  final List<AtlasEntityHeader> entities = new ArrayList<>();
  final int pageSize = 1;

  for (int i = 0; ; i++) {
    int offset = pageSize * i;

    LOG.info("retrieving databases: offset={}, pageSize={}", offset, pageSize);

    AtlasSearchResult       searchResult  = atlasClientV2.basicSearch(HIVE_TYPE_DB, null, null, true, pageSize, offset);
    List<AtlasEntityHeader> entityHeaders = searchResult == null ? null : searchResult.getEntities();
    int                     dbCount       = entityHeaders == null ? 0 : entityHeaders.size();

    LOG.info("retrieved {} databases", dbCount);

    if (dbCount > 0) {
      entities.addAll(entityHeaders);
    }

    if (dbCount < pageSize) { // last page
      break;
    }
  }
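
The termination condition in the loop above (stop when a page comes back shorter than pageSize) can be exercised without an Atlas server. The sketch below is a self-contained illustration under that assumption: the page source is an in-memory list, and fetchAll and slice are hypothetical names that do not exist in the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical stand-alone illustration of the paging loop.
class PagedFetchSketch {
    // Calls fetchPage(limit, offset) repeatedly until a short (or null) page is returned.
    static <T> List<T> fetchAll(BiFunction<Integer, Integer, List<T>> fetchPage, int pageSize) {
        List<T> all = new ArrayList<>();

        for (int i = 0; ; i++) {
            int     offset = pageSize * i;
            List<T> page   = fetchPage.apply(pageSize, offset);
            int     count  = page == null ? 0 : page.size();

            if (count > 0) {
                all.addAll(page);
            }

            if (count < pageSize) { // last page
                break;
            }
        }

        return all;
    }

    // In-memory stand-in for a paged search call.
    static <T> List<T> slice(List<T> source, int limit, int offset) {
        int from = Math.min(offset, source.size());
        int to   = Math.min(offset + limit, source.size());

        return source.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> source = List.of("db1", "db2", "db3", "db4", "db5");

        System.out.println(fetchAll((limit, offset) -> slice(source, limit, offset), 2));
    }
}
```

When the total count is an exact multiple of pageSize, the loop issues one extra request that returns an empty page and then stops.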



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 872 (patched)


Consider updating getAllTablesInDb() per comment above for 
getAllDatabaseInCluster().



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 939 (patched)


Move #939 inside else block at #936.



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 966 (patched)


db.getAttribute(ATTRIBUTE_QUALIFIED_NAME).toString() => (String) db.getAttribute(ATTRIBUTE_QUALIFIED_NAME)
  - this will handle a NULL value from db.getAttribute(ATTRIBUTE_QUALIFIED_NAME)
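
A small self-contained sketch of the difference. A plain Map stands in for the entity header's attribute lookup, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; a Map stands in for AtlasEntityHeader attributes.
class NullSafeCastSketch {
    // Casting a null reference is legal and simply yields null.
    static String viaCast(Map<String, Object> attrs, String key) {
        return (String) attrs.get(key);
    }

    // Calling toString() on a null reference throws NullPointerException.
    static String viaToString(Map<String, Object> attrs, String key) {
        return attrs.get(key).toString();
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>(); // attribute is missing

        System.out.println("cast: " + viaCast(attrs, "qualifiedName"));

        try {
            viaToString(attrs, "qualifiedName");
        } catch (NullPointerException e) {
            System.out.println("toString: NullPointerException");
        }
    }
}
```

The cast still fails with ClassCastException if the attribute holds a non-String value, so it only addresses the null case.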



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 969 (patched)


Include guid in the log message



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 978 (patched)


Failed to retrieve table entities for database {} from Atlas.



addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 985 (patched)


  Queue<String> guidsToDelete = new LinkedList<>();

  if (!hiveClient.databaseExists(hiveDbName)) {
    if (CollectionUtils.isNotEmpty(tables)) {
      for (AtlasEntityHeader table : tables) {
        guidsToDelete.add(table.getGuid());
      }
    }

    guidsToDelete.add(db.getGuid());
  } else {
    if (CollectionUtils.isNotEmpty(tables)) {
      for (AtlasEntityHeader table : tables) {
        String hiveTableName = getHiveTableName((String) table.getAttribute(ATTRIBUTE_QUALIFIED_NAME), true);

        if (StringUtils.isEmpty(hiveTableName)) {
          LOG.error("Cannot get table from qualifiedName {} ", (String) table.getAttribute(ATTRIBUTE_QUALIFIED_NAME));

          continue;
        }

        try {
          hiveClient.getTable(hiveDbName, hiveTableName, true);
        } catch (InvalidTableException e) {  // table doesn't exist
          LOG.info("Added table {}.{} to delete", hiveDbName, hiveTableName);

          guidsToDelete.add(table.getGuid());
        } catch (HiveException e) {
          LOG.error("Failed to get table {}.{} from Hive", hiveDbName, hiveTableName, e);

          if (failOnError) {
            throw e;
          }
        }
      }
    }
  }

  if (CollectionUtils.isNotEmpty(guidsToDelete)) {
    try {
      deleteByGuid(guidsToDelete);
    } catch (AtlasServiceException e) {
  

Review Request 72815: ENGESC-3520: Deletion of non existing hive entities

2020-08-28 Thread Pinal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/
---

Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Repository: atlas


Description
---

**Problem:** When a database or table is dropped in Hive while HiveHook is not 
enabled, there is no way to bring the database and table entities in Atlas back 
in sync with Hive.

**Workaround:** Added support for deleting Hive entities in Atlas that have been 
dropped in Hive.

**Usage:** ./import-hive.sh -deleteNonExisting


Diffs
---

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
  client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358


Diff: https://reviews.apache.org/r/72815/diff/1/


Testing
---

Manually tested


Thanks,

Pinal Shah