[
https://issues.apache.org/jira/browse/IMPALA-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971347#comment-16971347
]
Vihang Karajgaonkar commented on IMPALA-9139:
---------------------------------------------
Ah, I missed that. You are right. I will close this JIRA as not a bug.
> Invalidate metadata adds all the tables to background loading pool unnecessarily
> --------------------------------------------------------------------------------
>
> Key: IMPALA-9139
> URL: https://issues.apache.org/jira/browse/IMPALA-9139
> Project: IMPALA
> Issue Type: Bug
> Reporter: Vihang Karajgaonkar
> Priority: Major
>
> I see the following code in the reset() method of CatalogServiceCatalog
> {code:java}
> // Build a new DB cache, populate it, and replace the existing cache in one
> // step.
> Map<String, Db> newDbCache = new ConcurrentHashMap<String, Db>();
> List<TTableName> tblsToBackgroundLoad = new ArrayList<>();
> try (MetaStoreClient msClient = getMetaStoreClient()) {
>   List<String> allDbs = msClient.getHiveClient().getAllDatabases();
>   int numComplete = 0;
>   for (String dbName: allDbs) {
>     if (isBlacklistedDb(dbName)) {
>       LOG.info("skip blacklisted db: " + dbName);
>       continue;
>     }
>     String annotation = String.format("invalidating metadata - %s/%s dbs complete",
>         numComplete++, allDbs.size());
>     try (ThreadNameAnnotator tna = new ThreadNameAnnotator(annotation)) {
>       dbName = dbName.toLowerCase();
>       Db oldDb = oldDbCache.get(dbName);
>       Pair<Db, List<TTableName>> invalidatedDb = invalidateDb(msClient,
>           dbName, oldDb);
>       if (invalidatedDb == null) continue;
>       newDbCache.put(dbName, invalidatedDb.first);
>       tblsToBackgroundLoad.addAll(invalidatedDb.second);
>     }
>   }
> }
> dbCache_.set(newDbCache);
> // Identify any deleted databases and add them to the delta log.
> Set<String> oldDbNames = oldDbCache.keySet();
> Set<String> newDbNames = newDbCache.keySet();
> oldDbNames.removeAll(newDbNames);
> for (String dbName: oldDbNames) {
>   Db removedDb = oldDbCache.get(dbName);
>   updateDeleteLog(removedDb);
> }
> // Submit tables for background loading.
> for (TTableName tblName: tblsToBackgroundLoad) {
>   tableLoadingMgr_.backgroundLoad(tblName);
> }
> {code}
> As the code above shows, the tables are submitted for background loading
> without checking the flag {{loadInBackground_}}. This means that even if the
> flag is unset, issuing an invalidate metadata command causes all the tables in
> the system to be loaded in the background. Note that this code only loads the
> tables; it does not add the loaded tables to the catalog. That part is good,
> because otherwise the memory footprint of the catalog would grow after every
> invalidate metadata command.
> This bug has 2 implications:
> 1. We are wasting a lot of CPU cycles without getting anything out of it.
> 2. The more subtle side effect is that this fills up the
> {{tableLoadingDeque_}}, so any other background load task will take longer to
> complete.
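
For illustration only, here is a minimal, self-contained sketch of the kind of guard the description is asking about. It is not Impala's code: the class `BackgroundLoadSketch` and its methods are hypothetical, plain strings stand in for `TTableName`, and an `ArrayDeque` stands in for `tableLoadingDeque_`. It assumes the check on the flag lives where the per-database table list is built, so the submission loop itself can stay unconditional.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the catalog reset logic; not Impala's actual classes.
class BackgroundLoadSketch {
  // Plays the role of the loadInBackground_ flag from the description.
  private final boolean loadInBackground;
  // Plays the role of tableLoadingDeque_: pending background load requests.
  private final Deque<String> tableLoadingDeque = new ArrayDeque<>();

  BackgroundLoadSketch(boolean loadInBackground) {
    this.loadInBackground = loadInBackground;
  }

  // Returns the tables of one database that should be queued for background
  // loading. When the flag is unset the list is empty, so nothing is queued.
  List<String> tablesToBackgroundLoad(List<String> tableNames) {
    List<String> result = new ArrayList<>();
    if (!loadInBackground) return result;
    result.addAll(tableNames);
    return result;
  }

  // Collects the per-database lists, then submits everything in one pass,
  // mirroring the shape of the loop in the quoted snippet.
  void reset(List<List<String>> tablesPerDb) {
    List<String> tblsToBackgroundLoad = new ArrayList<>();
    for (List<String> tables : tablesPerDb) {
      tblsToBackgroundLoad.addAll(tablesToBackgroundLoad(tables));
    }
    for (String tblName : tblsToBackgroundLoad) {
      tableLoadingDeque.addLast(tblName);
    }
  }

  // Number of load requests still waiting in the queue.
  int pendingLoads() {
    return tableLoadingDeque.size();
  }

  public static void main(String[] args) {
    List<List<String>> tablesPerDb = Arrays.asList(
        Arrays.asList("db1.t1", "db1.t2"),
        Arrays.asList("db2.t1"));

    BackgroundLoadSketch flagOff = new BackgroundLoadSketch(false);
    flagOff.reset(tablesPerDb);
    System.out.println("flag off, pending loads: " + flagOff.pendingLoads());

    BackgroundLoadSketch flagOn = new BackgroundLoadSketch(true);
    flagOn.reset(tablesPerDb);
    System.out.println("flag on, pending loads: " + flagOn.pendingLoads());
  }
}
```

With a guard like this, a reset with the flag unset leaves the queue empty, so it neither spends CPU on unwanted loads nor delays other queued work, which are the two implications listed above.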
--
This message was sent by Atlassian Jira
(v8.3.4#803005)