I think the primary motivation for the cache was the cloud use case, where the default SQL instance is underpowered and you have to set up (and pay for) a bigger one to get decent performance. The same may apply to setting up redis/memcached in a cloud environment - if you're going to set one up, you might as well just get the faster SQL instance. On-prem, any reasonable RDBMS is usually fast enough that caching metadata may not provide much benefit.

Also, I'm not familiar with DataNucleus (DN) caching, but Hive actually does not use the DN ORM itself to retrieve most of the performance-sensitive data (like partitions and stats), because the ORM's initial retrieval of many entities is very inefficient; instead it issues SQL queries directly. I'm not sure it would be easy to integrate that path with DN caching.
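To illustrate why the direct-SQL path matters: an ORM-style retrieval tends to issue one query per entity (plus more for each nested collection), while a hand-written query can fetch everything in one round trip. The sketch below is a toy analogy in Python/sqlite3 - the table and column names only loosely echo the metastore schema and are not the real Hive tables:

```python
import sqlite3

# Toy schema loosely modeled on metastore partition tables
# (illustrative names only, not the actual Hive metastore schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, PART_NAME TEXT);
CREATE TABLE PARTITION_PARAMS (PART_ID INTEGER, PARAM_KEY TEXT, PARAM_VALUE TEXT);
""")
for pid in range(1, 4):
    conn.execute("INSERT INTO PARTITIONS VALUES (?, ?)",
                 (pid, "ds=2017-01-0%d" % pid))
    conn.execute("INSERT INTO PARTITION_PARAMS VALUES (?, 'numRows', ?)",
                 (pid, str(pid * 100)))

def fetch_orm_style():
    """ORM-like access: one query per entity plus one per nested map
    (the N+1 pattern that makes bulk retrieval slow)."""
    parts = []
    ids = [r[0] for r in conn.execute("SELECT PART_ID FROM PARTITIONS")]
    for pid in ids:  # one round trip per partition
        name = conn.execute(
            "SELECT PART_NAME FROM PARTITIONS WHERE PART_ID=?", (pid,)).fetchone()[0]
        params = dict(conn.execute(
            "SELECT PARAM_KEY, PARAM_VALUE FROM PARTITION_PARAMS WHERE PART_ID=?", (pid,)))
        parts.append((name, params))
    return parts

def fetch_direct_sql():
    """Direct-SQL access: a single joined query fetches all partitions
    and their parameters in one round trip."""
    rows = conn.execute("""
        SELECT p.PART_NAME, pp.PARAM_KEY, pp.PARAM_VALUE
        FROM PARTITIONS p
        LEFT JOIN PARTITION_PARAMS pp ON p.PART_ID = pp.PART_ID
        ORDER BY p.PART_ID""")
    parts = {}
    for name, key, value in rows:
        parts.setdefault(name, {})[key] = value
    return list(parts.items())

# Both paths return the same data; the direct query just does it in
# one statement instead of 2N+1.
assert fetch_orm_style() == fetch_direct_sql()
```

The results are identical; the difference is purely in the number of statements (and, over a network, round trips) issued, which is why bulk partition retrieval bypasses the ORM.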
Thejas might have more details on the design.

On 18/2/23, 10:49, "Alexander Kolbasov" <ak...@cloudera.com> wrote:

>Hello,
>
>I am wondering about the design choices made for ObjectStore cache. Looks
>like Datanucleus has support for L2 caching using various backends,
>including memcache and redis (caching support is pluggable). I am wondering
>why you decided to implement your own caching solution instead. Even if you
>wanted to cache at thrift level, using memcached or redis seems like a
>useful thing to consider. Were there any reasons to avoid these?
>
>- Alex