[
https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vihang Karajgaonkar updated HIVE-21596:
---------------------------------------
Description:
{{HiveMetastoreClient}} currently assumes that the client and server versions
are the same. Additionally, since the server APIs are backwards compatible, it
is possible for an older client (e.g. a 2.1.0 client) to connect to a newer
server (e.g. a 3.1.0 server) without any issues. This is useful in setups where
HMS is deployed in remote mode and clients connect to it remotely.
It would be a good improvement if a newer version of {{HiveMetastoreClient}}
could connect to an older server version. When a newer client is talking to an
older server, the following things can happen:
1. The client invokes an RPC that does not exist on the older server.
In such a case, Thrift will throw an {{Invalid method name}} exception, which
should automatically be handled by the clients since each API already throws
TException.
2. The client invokes an RPC using Thrift objects that have new fields added.
When a new field is added to a Thrift object, the older server does not
deserialize the field in the first place, since it does not know about that
field id. So wire compatibility already exists. However, the client-side
application should understand the implications of this behavior: the server
silently ignores the new field. In such cases, it would be better for the
client to check the server version (support for which was added in HIVE-21484)
and throw an exception.
3. The newer client has re-implemented a certain API, for example using a newer
Thrift API. The client will then start seeing an {{Invalid method name}}
exception, since the older server does not have such a method.
This can be handled on the client side by making the newer implementation
conditional on the server version: the client should check the server version
and invoke the new implementation only if the server version supports the newer
API. (On a side note, it would be great if the metastore also reported which
APIs are supported for a given version.)
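To make case 3 concrete, here is a minimal sketch of a version-gated call with a fallback to the older RPC. Everything in it is an assumption made for illustration: {{MetastoreRpc}}, its three methods, {{UnknownMethodException}} and {{FakeServer}} are hypothetical names, not Hive or Thrift APIs. The exception is an unchecked stand-in for the Thrift failure that carries the {{Invalid method name}} message (in real Thrift clients this is a checked TException subclass). The sketch also treats a server that lacks the version RPC itself as "old".

```java
import java.util.Collections;
import java.util.List;

final class VersionGatedClient {

    /** Unchecked stand-in for Thrift's "Invalid method name" failure (hypothetical type). */
    static final class UnknownMethodException extends RuntimeException {
        UnknownMethodException(String method) {
            super("Invalid method name: '" + method + "'");
        }
    }

    /** Hypothetical slice of the metastore RPC surface; none of these names come from Hive. */
    interface MetastoreRpc {
        String getServerVersion();
        List<String> listTablesNew(String db);
        List<String> listTablesOld(String db);
    }

    /** Parses the leading digits of a version component, so "0-SNAPSHOT" yields 0. */
    static int leadingInt(String s) {
        int n = 0;
        for (int i = 0; i < s.length() && Character.isDigit(s.charAt(i)); i++) {
            n = n * 10 + (s.charAt(i) - '0');
        }
        return n;
    }

    /** Compares dotted versions numerically; missing components count as 0. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? leadingInt(x[i]) : 0;
            int yi = i < y.length ? leadingInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Uses the newer RPC only when the server reports a version that has it. */
    static List<String> listTables(MetastoreRpc rpc, String db, String minVersionForNewApi) {
        String serverVersion;
        try {
            serverVersion = rpc.getServerVersion();
        } catch (UnknownMethodException e) {
            serverVersion = null; // server predates the version RPC: assume it is old
        }
        boolean useNew = serverVersion != null
                && compareVersions(serverVersion, minVersionForNewApi) >= 0;
        return useNew ? rpc.listTablesNew(db) : rpc.listTablesOld(db);
    }

    /** In-memory fake server used only for the demo in main. */
    static final class FakeServer implements MetastoreRpc {
        private final String version;    // null simulates a server without the version RPC
        private final boolean hasNewApi;

        FakeServer(String version, boolean hasNewApi) {
            this.version = version;
            this.hasNewApi = hasNewApi;
        }

        public String getServerVersion() {
            if (version == null) {
                throw new UnknownMethodException("getServerVersion");
            }
            return version;
        }

        public List<String> listTablesNew(String db) {
            if (!hasNewApi) {
                throw new UnknownMethodException("listTablesNew");
            }
            return Collections.singletonList("new:" + db);
        }

        public List<String> listTablesOld(String db) {
            return Collections.singletonList("old:" + db);
        }
    }

    public static void main(String[] args) {
        // Newer server: the gated path picks the new RPC.
        System.out.println(listTables(new FakeServer("3.1.0", true), "db", "3.0.0"));
        // Older server: the client falls back instead of hitting "Invalid method name".
        System.out.println(listTables(new FakeServer("2.3.0", false), "db", "3.0.0"));
    }
}
```

Checking the version up front, rather than calling the new RPC and catching the failure, keeps the fallback decision in one place and avoids a wasted round trip on every call against an old server. The minimum version per API is passed in by the caller here; a real client would need some table of which server version introduced which API, which is what the side note in case 3 asks the metastore to provide.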
One real-world use case for such a feature is Impala, which wants to be able to
talk to both HMS 2.x and HMS 3.x. Other applications, such as Spark (or
third-party applications that want to support multiple HMS versions), may also
find this useful.
Also, this patch will make a best effort to fix all such cases between Hive
2.3.0 and newer versions of HMS. Making the coverage exhaustive should be an
ongoing effort. We will also need to add support in our test infrastructure to
spin up older HMS server versions and test them using newer client APIs. I will
create a separate sub-task for that, since it may need more plumbing in ptest.
was:
{{HiveMetastoreClient}} currently assumes that the client and server versions
are the same. Additionally, since the server APIs are backwards compatible, it
is possible for an older client (e.g. a 2.1.0 client) to connect to a newer
server (e.g. a 3.1.0 server) without any issues. This is useful in setups where
HMS is deployed in remote mode and clients connect to it remotely.
It would be a good improvement if a newer version of {{HiveMetastoreClient}}
could connect to an older server version. When a newer client is talking to an
older server, the following things can happen:
1. The client invokes an RPC that does not exist on the older server.
In such a case, Thrift will throw an {{Invalid method name}} exception, which
should automatically be handled by the clients since each API already throws
TException.
2. The client invokes an RPC using Thrift objects that have new fields added.
When a new field is added to a Thrift object, the older server does not
deserialize the field in the first place, since it does not know about that
field id. So wire compatibility already exists. However, the client-side
application should understand the implications of this behavior: the server
silently ignores the new field. In such cases, it would be better for the
client to check the server version (support for which was added in HIVE-21484)
and throw an exception.
3. The newer client has re-implemented a certain API, for example using a
newer, more efficient Thrift API, while an older Thrift API that provides the
same functionality also exists. In this case, the new client will start seeing
an {{Invalid method name}} exception, since the older server does not have such
a method. This can be handled on the client side by making the newer
implementation conditional on the server version and falling back to the older
(possibly less efficient) one when necessary: the client should check the
server version and invoke the new implementation only if the server version
supports the newer API. (On a side note, it would be great if the metastore
also reported which APIs are supported for a given version.)
One real-world use case for such a feature is Impala, which wants to be able to
talk to both HMS 2.x and HMS 3.x. Other applications, such as Spark (or
third-party applications that want to support multiple HMS versions), may also
find this useful.
> HiveMetastoreClient should be able to connect to older metastore servers
> ------------------------------------------------------------------------
>
> Key: HIVE-21596
> URL: https://issues.apache.org/jira/browse/HIVE-21596
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
>