-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/#review221744
-----------------------------------------------------------


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 103 (patched)
<https://reviews.apache.org/r/72815/#comment310794>

    Delete database and table entities in Atlas if not present in Hive


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 843 (patched)
<https://reviews.apache.org/r/72815/#comment310799>

    getAllDatabaseInCluster(): consider retrieving only databases where clusterName == atlas.metadata.namespace, to ensure that databases and tables that belong to another metadata namespace are not deleted.


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 846 (patched)
<https://reviews.apache.org/r/72815/#comment310796>

    Consider reading limit (pageSize) from configuration.


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 850 (patched)
<https://reviews.apache.org/r/72815/#comment310795>

    Consider the following for better readability:

        final List<AtlasEntityHeader> entities = new ArrayList<>();
        final int                     pageSize = 10000;

        for (int i = 0; ; i++) {
            int offset = pageSize * i;

            LOG.info("retrieving databases: offset={}, pageSize={}", offset, pageSize);

            AtlasSearchResult       searchResult  = atlasClientV2.basicSearch(HIVE_TYPE_DB, null, null, true, pageSize, offset);
            List<AtlasEntityHeader> entityHeaders = searchResult == null ? null : searchResult.getEntities();
            int                     dbCount       = entityHeaders == null ? 0 : entityHeaders.size();

            LOG.info("retrieved {} databases", dbCount);

            if (dbCount > 0) {
                entities.addAll(entityHeaders);
            }

            if (dbCount < pageSize) { // last page
                break;
            }
        }


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 872 (patched)
<https://reviews.apache.org/r/72815/#comment310797>

    Consider updating getAllTablesInDb() per comment above for getAllDatabaseInCluster().
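
One possible way to share the paging loop above between getAllDatabaseInCluster() and getAllTablesInDb(), with the page size passed in (for example after reading it from configuration), is sketched below. This is only an illustration: PagedFetch, PageFetcher and fetchAll are hypothetical names that do not exist in Atlas, and the callback is assumed to wrap a call such as atlasClientV2.basicSearch(...), including handling of its checked exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Atlas: collects all pages returned by a paged search.
final class PagedFetch {
    /** Hypothetical callback: returns one page of results; null or empty means nothing was found. */
    @FunctionalInterface
    public interface PageFetcher<T> {
        List<T> fetch(int limit, int offset);
    }

    /** Calls the fetcher with increasing offsets until a page shorter than pageSize is returned. */
    static <T> List<T> fetchAll(int pageSize, PageFetcher<T> fetcher) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive: " + pageSize);
        }

        List<T> all = new ArrayList<>();

        for (int offset = 0; ; offset += pageSize) {
            List<T> page  = fetcher.fetch(pageSize, offset);
            int     count = page == null ? 0 : page.size();

            if (count > 0) {
                all.addAll(page);
            }

            if (count < pageSize) { // last page
                break;
            }
        }

        return all;
    }
}
```

With a helper along these lines, each caller would supply only the search itself, e.g. a lambda that runs the search for HIVE_TYPE_DB or for the tables of one database and returns searchResult.getEntities().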


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 939 (patched)
<https://reviews.apache.org/r/72815/#comment310798>

    Move #939 inside else block at #936.


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 966 (patched)
<https://reviews.apache.org/r/72815/#comment310800>

    db.getAttribute(ATTRIBUTE_QUALIFIED_NAME).toString() => (String) db.getAttribute(ATTRIBUTE_QUALIFIED_NAME)
     - this will handle a NULL value from db.getAttribute(ATTRIBUTE_QUALIFIED_NAME)


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 969 (patched)
<https://reviews.apache.org/r/72815/#comment310801>

    Include guid in the log message


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 978 (patched)
<https://reviews.apache.org/r/72815/#comment310802>

    Failed to retrieve table entities for database {} from Atlas.


addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 985 (patched)
<https://reviews.apache.org/r/72815/#comment310803>

    Queue<String> guidsToDelete = new LinkedList<>();

    if (!hiveClient.databaseExists(hiveDbName)) {
        if (CollectionUtils.isNotEmpty(tables)) {
            for (AtlasEntityHeader table : tables) {
                guidsToDelete.add(table.getGuid());
            }
        }

        guidsToDelete.add(db.getGuid());
    } else {
        if (CollectionUtils.isNotEmpty(tables)) {
            for (AtlasEntityHeader table : tables) {
                String hiveTableName = getHiveTableName((String) table.getAttribute(ATTRIBUTE_QUALIFIED_NAME), true);

                if (StringUtils.isEmpty(hiveTableName)) {
                    LOG.error("Cannot get table from qualifiedName {} ", (String) table.getAttribute(ATTRIBUTE_QUALIFIED_NAME));

                    continue;
                }

                try {
                    hiveClient.getTable(hiveDbName, hiveTableName, true);
                } catch (InvalidTableException e) { // table doesn't exist
                    LOG.info("Added table {}.{} to delete", hiveDbName, hiveTableName);

                    guidsToDelete.add(table.getGuid());
                } catch (HiveException e) {
                    LOG.error("Failed to get table {}.{} from Hive", hiveDbName, hiveTableName, e);

                    if (failOnError) {
                        throw e;
                    }
                }
            }
        }
    }

    if (CollectionUtils.isNotEmpty(guidsToDelete)) {
        try {
            deleteByGuid(guidsToDelete);
        } catch (AtlasServiceException e) {
            LOG.error("Failed to delete Atlas entities for database {} , {}", hiveDbName, e);

            if (failOnError) {
                throw e;
            }
        }
    }


- Madhan Neethiraj


On Aug. 28, 2020, 1:33 p.m., Pinal Shah wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72815/
> -----------------------------------------------------------
> 
> (Updated Aug. 28, 2020, 1:33 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> **Problem:** Whenever a database or table is dropped in Hive and HiveHook is not enabled, we don't have any way to keep the databases and tables in Atlas in sync with Hive.
> 
> **Workaround:** Added support to delete Hive entities in Atlas that have been dropped in Hive.
> 
> **Usage:** ./import-hive.sh -deleteNonExisting
> 
> 
> Diffs
> -----
> 
>   addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041 
>   client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358 
> 
> 
> Diff: https://reviews.apache.org/r/72815/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>