I think the primary motivation for the cache was the cloud use case, where
the default SQL instance is underpowered and you have to set up and pay
more to get a performant one. The same may apply to setting up
redis/memcached in a cloud environment: if you set one up, you might as
well just get the faster SQL instance.
On-prem, any reasonable RDBMS is usually fast enough that caching metadata
may not provide much benefit.
Also, I’m not familiar with DN caching, but Hive does not actually use the
DN ORM itself to retrieve most of the performance-sensitive data (such as
partitions and stats), because the ORM's initial retrieval of many entities
is very inefficient; instead, Hive issues SQL queries directly. So I’m not
sure it would be easy to integrate with DN caching.

Thejas might have more details on the design.

On 18/2/23, 10:49, "Alexander Kolbasov" <ak...@cloudera.com> wrote:

>I am wondering about the design choices made for ObjectStore cache. Looks
>like Datanucleus has support for L2 caching using various backends,
>including memcache and redis (caching support is pluggable). I am wondering
>why you decided to implement your own caching solution instead. Even if you
>wanted to cache at thrift level, using memcached or redis seems like a
>useful thing to consider. Were there any reasons to avoid these?
>- Alex
