How much benefit is that massive cache giving you? If it’s not giving much benefit, maybe we should not be caching so much. Maybe in your case caching just the current schema (or the most recent 2 or 3 schemas) would be a better strategy.
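A minimal sketch of that alternative, assuming a hypothetical `RecentSchemaCache` class (not part of Calcite) that keeps only the few most recently used schemas and reloads the rest on demand. The class name, the loader function, and the size limit are illustrative assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical bounded cache that keeps only the most recently used entries. */
public class RecentSchemaCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;

    public RecentSchemaCache(final int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true: the eldest entry is the least recently accessed one.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, loading (and possibly evicting) on a miss. */
    public synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    public synchronized int size() {
        return entries.size();
    }

    /** Checks presence without changing the access order. */
    public synchronized boolean isCached(K key) {
        return entries.containsKey(key);
    }
}
```

Counting how often the loader runs with a limit of 2 or 3 would give a rough measure of how much the full cache is actually buying.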
As y’all know, I’m always in favor of removing caches. Or at least getting them to prove their worth.

Julian

> On Oct 28, 2019, at 11:50 PM, Feng Zhu <[email protected]> wrote:
>
> Thanks, I will open a JIRA for discussion, with design doc and testing
> report.
>
> Danny Chan <[email protected]> wrote on Tue, Oct 29, 2019 at 12:11 PM:
>
>> Sounds very attractive, could you give an intuitive design doc to
>> illustrate how it works? And we may review the design then ;)
>>
>> Best,
>> Danny Chan
>> On Oct 29, 2019 at 10:36 AM +0800, Feng Zhu <[email protected]> wrote:
>>> Hi all,
>>> We made some optimizations in practice. But I'm not sure whether this
>>> kind of change is necessary to the community, because it will make the
>>> code complex.
>>>
>>> Currently, JdbcSchema caches all JdbcTables in tableMap (i.e.,
>>> *ImmutableMap<String, JdbcTable> tableMap*).
>>>
>>> In our production environment, there are about 3000+ datasources, and
>>> correspondingly 3000+ JdbcSchemas are created, while each JdbcSchema may
>>> contain up to 10000+ tables. Consequently, the table map occupies nearly
>>> 10GB memory, putting great pressure on the server.
>>>
>>> We encode each <*catalogName, schemaName, tableTypeName*> tuple as a unique
>>> Integer, and simplify the table map to <*String, Integer*>. From the
>>> Integer, we can find the tuple and construct the JdbcTable dynamically.
>>> Thanks to this, the cached table map costs only about 800MB memory.
>>>
>>> Best,
>>> DonnyZone
>>
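
The encoding described in the original message could look roughly like the sketch below. This is not the actual patch: `CompactTableIndex`, `TableKey`, and all method names are hypothetical, and the step that builds a real JdbcTable from the recovered tuple is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical index mapping table names to small integer ids of shared tuples. */
public class CompactTableIndex {
    /** One distinct <catalogName, schemaName, tableTypeName> combination. */
    public static final class TableKey {
        public final String catalogName;
        public final String schemaName;
        public final String tableTypeName;

        TableKey(String catalogName, String schemaName, String tableTypeName) {
            this.catalogName = catalogName;
            this.schemaName = schemaName;
            this.tableTypeName = tableTypeName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TableKey)) {
                return false;
            }
            TableKey other = (TableKey) o;
            return Objects.equals(catalogName, other.catalogName)
                && Objects.equals(schemaName, other.schemaName)
                && Objects.equals(tableTypeName, other.tableTypeName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(catalogName, schemaName, tableTypeName);
        }
    }

    private final List<TableKey> keysById = new ArrayList<>();      // id -> tuple
    private final Map<TableKey, Integer> idsByKey = new HashMap<>(); // tuple -> id
    private final Map<String, Integer> tableToKeyId = new HashMap<>();

    /** Registers a table, reusing the id of an identical tuple if one exists. */
    public void add(String tableName, String catalog, String schema, String tableType) {
        TableKey key = new TableKey(catalog, schema, tableType);
        Integer id = idsByKey.get(key);
        if (id == null) {
            id = keysById.size();
            keysById.add(key);
            idsByKey.put(key, id);
        }
        tableToKeyId.put(tableName, id);
    }

    /** Returns the tuple for a table name, or null if the table is unknown. */
    public TableKey lookup(String tableName) {
        Integer id = tableToKeyId.get(tableName);
        return id == null ? null : keysById.get(id);
    }

    public int distinctKeys() {
        return keysById.size();
    }
}
```

The saving comes from storing one small Integer per table name instead of a full table object, with the few distinct tuples shared across all tables; the cost is constructing the table object on each lookup.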
