[
https://issues.apache.org/jira/browse/HIVE-20190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
BELUGA BEHR updated HIVE-20190:
-------------------------------
Description:
https://github.com/apache/hive/blob/e7d1781ec4662e088dcd6ffbe3f866738792ad9b/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L320
There are times when a misbehaving client can knock an HS2 instance offline
because it opens many simultaneous connections and takes up all of the
resources. It would be nice if we could log the source IP address of each
connection along with the "Client protocol version" information. That would
make it much easier to pinpoint the problematic client. Extra credit for
logging the Kerberos principal name as well.
The current log output when a client connects looks something like:
{code}
2018-07-16 09:40:44,939 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-290000]: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V7
2018-07-16 09:40:44,941 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Trying to connect to metastore with URI thrift://host:9083
2018-07-16 09:40:44,942 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Opened a connection to metastore, current connections: 40
2018-07-16 09:40:44,943 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Connected to metastore.
2018-07-16 09:40:44,950 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created local directory: /tmp/d88e17d3-ac42-4de5-8043-9a9e2097ef8d_resources
2018-07-16 09:40:44,953 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created HDFS directory: /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,954 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created local directory: /tmp/hive/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,957 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created HDFS directory: /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d/_tmp_space.db
2018-07-16 09:40:44,958 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: No Tez session required at this point. hive.execution.engine=mr.
2018-07-16 09:40:44,958 INFO org.apache.hive.service.cli.session.HiveSessionImpl: [HiveServer2-Handler-Pool: Thread-290000]: Operation log session directory is created: /tmp/hive/operation_logs/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,959 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-290000]: Opened a session, current sessions: 883
{code}
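As a rough illustration of the request, the "Client protocol version" message could carry the client address and principal on the same line. This is only a sketch under assumptions: the class `OpenSessionLogSketch`, the method `describeClient`, the message layout, and the sample IP address and principal are all hypothetical and are not taken from ThriftCLIService.

```java
// Hypothetical helper, not part of Hive: builds the proposed log message.
final class OpenSessionLogSketch {
    private OpenSessionLogSketch() {}

    /**
     * Appends the client IP and principal to the existing
     * "Client protocol version" text. Missing values are reported as
     * "unknown" so the line keeps the same shape for every connection.
     */
    static String describeClient(String protocolVersion, String clientIp, String principal) {
        return "Client protocol version: " + protocolVersion
            + ", client IP: " + orUnknown(clientIp)
            + ", principal: " + orUnknown(principal);
    }

    // Treats null and blank strings the same way.
    private static String orUnknown(String value) {
        return (value == null || value.trim().isEmpty()) ? "unknown" : value;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        System.out.println(describeClient(
            "HIVE_CLI_SERVICE_PROTOCOL_V7", "10.0.0.5", "alice@EXAMPLE.COM"));
    }
}
```

Keeping everything on the one line that is already logged per connection would let an operator group the log by client IP without correlating thread names across messages.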
was:
https://github.com/apache/hive/blob/e7d1781ec4662e088dcd6ffbe3f866738792ad9b/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L320
There are times when a misbehaving client can knock a HS2 instance offline
because it opens many simultaneous connections and takes up all of the
resources. It would be nice if we could log the source IP address of each
connection along with the "Client protocol version" information. In this way
it is much easier to pinpoint the problematic client. Extra credit for
kerberos principal name as well.
> Report Client IP Address When Opening New Session
> -------------------------------------------------
>
> Key: HIVE-20190
> URL: https://issues.apache.org/jira/browse/HIVE-20190
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Affects Versions: 3.0.0, 2.3.2, 4.0.0
> Reporter: BELUGA BEHR
> Priority: Major
>
> https://github.com/apache/hive/blob/e7d1781ec4662e088dcd6ffbe3f866738792ad9b/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L320
> There are times when a misbehaving client can knock an HS2 instance offline
> because it opens many simultaneous connections and takes up all of the
> resources. It would be nice if we could log the source IP address of each
> connection along with the "Client protocol version" information. That would
> make it much easier to pinpoint the problematic client. Extra credit for
> logging the Kerberos principal name as well.
> The current log output when a client connects looks something like:
> {code}
> 2018-07-16 09:40:44,939 INFO
> org.apache.hive.service.cli.thrift.ThriftCLIService:
> [HiveServer2-Handler-Pool: Thread-290000]: Client protocol version:
> HIVE_CLI_SERVICE_PROTOCOL_V7
> 2018-07-16 09:40:44,941 INFO hive.metastore: [HiveServer2-Handler-Pool:
> Thread-290000]: Trying to connect to metastore with URI thrift://host:9083
> 2018-07-16 09:40:44,942 INFO hive.metastore: [HiveServer2-Handler-Pool:
> Thread-290000]: Opened a connection to metastore, current connections: 40
> 2018-07-16 09:40:44,943 INFO hive.metastore: [HiveServer2-Handler-Pool:
> Thread-290000]: Connected to metastore.
> 2018-07-16 09:40:44,950 INFO
> org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool:
> Thread-290000]: Created local directory:
> /tmp/d88e17d3-ac42-4de5-8043-9a9e2097ef8d_resources
> 2018-07-16 09:40:44,953 INFO
> org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool:
> Thread-290000]: Created HDFS directory:
> /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
> 2018-07-16 09:40:44,954 INFO
> org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool:
> Thread-290000]: Created local directory:
> /tmp/hive/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
> 2018-07-16 09:40:44,957 INFO
> org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool:
> Thread-290000]: Created HDFS directory:
> /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d/_tmp_space.db
> 2018-07-16 09:40:44,958 INFO
> org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool:
> Thread-290000]: No Tez session required at this point.
> hive.execution.engine=mr.
> 2018-07-16 09:40:44,958 INFO
> org.apache.hive.service.cli.session.HiveSessionImpl:
> [HiveServer2-Handler-Pool: Thread-290000]: Operation log session directory is
> created: /tmp/hive/operation_logs/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
> 2018-07-16 09:40:44,959 INFO
> org.apache.hive.service.cli.thrift.ThriftCLIService:
> [HiveServer2-Handler-Pool: Thread-290000]: Opened a session, current
> sessions: 883
> {code}