Apologies if this has been discussed in the past - my searches did not pull up any relevant threads. If there are better solutions available out of the box, please let me know!

Problem statement
--------------------------
We have a setup where a single metastore DB is used by Hive, Presto and SparkSQL. In addition, thousands of Hive queries are submitted in batch form from multiple machines. Oftentimes the metastore DB ends up being remote (in a different AWS region, etc.) and round-trip latency is high. We've seen single thrift calls get translated into lots of small SQL calls by DataNucleus, and the round-trip latency ends up killing performance.

Furthermore, any of these systems may create or modify a Hive table, and the change should be reflected in the other systems. For example, I may create a table in Hive and query it using Presto, or vice versa. In our setup, there may be multiple thrift metastore servers pointing to the same metastore DB.

Investigation
-------------------
Basically, we've been looking at caching to solve this problem (will come to invalidation in a bit). I looked briefly at DN's support for caching - these two parameters seem to be switched off by default:

METASTORE_CACHE_LEVEL2("datanucleus.cache.level2", false),
METASTORE_CACHE_LEVEL2_TYPE("datanucleus.cache.level2.type", "none"),

Furthermore, my reading of http://www.datanucleus.org/products/datanucleus/jdo/cache.html suggests that there is no sophistication in invalidation - it seems like only time-based invalidation is supported, and it can't work across multiple PMFs (and therefore across multiple thrift metastore servers).

Solution Outline
-----------------------
- Every table / partition will have an additional property called 'version'
- Any call that modifies a table or partition will bump up the version of that table / partition
- Guava-based cache of the thrift objects that come back from metastore calls
- Before returning an entry from the cache, we fire a single SQL query to check that the cached version still matches the version in the metastore DB
- It is conceivable to have a mode wherein invalidation based on version happens in a background thread (for higher performance, lower fidelity)
- Not proposing any locking (not shooting for world peace here :) )
- We could extend the HiveMetaStore class or create a new server altogether

Is this something that would be interesting to the community? Is this problem already solved and should I spend my time watching GoT instead?

Thanks
Siva
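
P.S. To make the outline above a bit more concrete, here is a rough sketch of the version-checked read path. It is only an illustration under assumptions, not a proposed patch: the class name VersionedCache and the versionLookup / loader functions are hypothetical stand-ins (for the single version-check SQL and for the full metastore fetch respectively), and it uses a plain ConcurrentHashMap rather than Guava so that it runs without dependencies. Eviction, size limits and the background-invalidation mode are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: a cached object is trusted only while the version
// stored next to it matches the version currently reported by the DB.
final class VersionedCache<K, V> {

    // A cached value together with the version it was loaded at.
    private static final class Entry<V> {
        final long version;
        final V value;

        Entry(long version, V value) {
            this.version = version;
            this.value = value;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    // Assumption: stands in for the single cheap SQL that reads 'version'.
    private final Function<K, Long> versionLookup;
    // Assumption: stands in for the expensive fetch of the thrift object.
    private final Function<K, V> loader;

    VersionedCache(Function<K, Long> versionLookup, Function<K, V> loader) {
        this.versionLookup = versionLookup;
        this.loader = loader;
    }

    V get(K key) {
        // Read the version first. If a writer bumps it while we are loading,
        // we store a newer value under an older version, and the next get()
        // simply sees a mismatch and reloads. No locking is attempted.
        long current = versionLookup.apply(key);
        Entry<V> cached = entries.get(key);
        if (cached != null && cached.version == current) {
            return cached.value; // hit: one cheap round trip only
        }
        V fresh = loader.apply(key); // miss or stale: full fetch
        entries.put(key, new Entry<>(current, fresh));
        return fresh;
    }
}
```

With this shape, a cache hit costs one small query instead of the many that a full object fetch generates, and a table modified through any other metastore server is picked up on the next read because its version no longer matches.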