[
https://issues.apache.org/jira/browse/SENTRY-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933633#comment-15933633
]
Lei (Eddy) Xu edited comment on SENTRY-1630 at 3/20/17 9:42 PM:
----------------------------------------------------------------
It looks like {{SentryService}} keeps restarting and creating new
connections to HMS.
{noformat}
2017-03-17 18:25:44,746 INFO hive.metastore: Trying to connect to metastore with URI thrift://node-1.network1978:9083
2017-03-17 18:25:44,764 INFO hive.metastore: Opened a connection to metastore, current connections: 1
2017-03-17 18:25:44,807 INFO hive.metastore: Connected to metastore.
2017-03-17 18:25:44,807 INFO org.apache.sentry.service.thrift.HMSFollower: Non secure connection established with HMS
2017-03-17 18:25:44,807 INFO org.apache.sentry.service.thrift.HMSFollower: HMSFollower of Sentry successfully connected to HMS
2017-03-17 18:25:44,823 INFO org.apache.sentry.service.thrift.HMSFollower: Before fetching hive full snapshot, Current NotificationID = 995.
2017-03-17 18:25:46,075 ERROR org.apache.sentry.hdfs.FullUpdateInitializer: Task did not complete successfully after 0 tries. Exception got: org.apache.thrift.TApplicationException: get_partition_names failed: out of sequence response
2017-03-17 18:25:46,076 ERROR org.apache.sentry.hdfs.FullUpdateInitializer: Task did not complete successfully after 0 tries. Exception got: org.apache.thrift.TApplicationException: get_partitions_by_names failed: out of sequence response
2017-03-17 18:25:46,834 ERROR org.apache.sentry.service.thrift.HMSFollower: Exception occurred persisting Hive full snapshot into DB
java.lang.RuntimeException: org.apache.thrift.TApplicationException: get_partition_names failed: out of sequence response
    at org.apache.sentry.hdfs.FullUpdateInitializer.createInitialUpdate(FullUpdateInitializer.java:326)
    at org.apache.sentry.service.thrift.HMSFollower.fetchFullUpdate(HMSFollower.java:317)
    at org.apache.sentry.service.thrift.HMSFollower.run(HMSFollower.java:246)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.TApplicationException: get_partition_names failed: out of sequence response
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition_names(ThriftHiveMetastore.java:2164)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition_names(ThriftHiveMetastore.java:2149)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionNames(HiveMetaStoreClient.java:1376)
    at org.apache.sentry.hdfs.FullUpdateInitializer$TableTask.doTask(FullUpdateInitializer.java:227)
    at org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask$RetryStrategy.exec(FullUpdateInitializer.java:112)
    at org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:152)
    at org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    ... 3 more
2017-03-17 18:25:49,364 INFO org.apache.sentry.SentryMain: Configuring log4j to use [/var/run/cloudera-scm-agent/process/149-sentry-SENTRY_SERVER/sentry-log4j.properties]
2017-03-17 18:25:49,645 INFO org.apache.sentry.service.thrift.SentryService: Configured on address node-1.network1978/192.168.125.2:8038
2017-03-17 18:25:49,657 INFO org.apache.sentry.service.thrift.LeaderStatusMonitor: Leader election protocol disabled, assuming single active server
2017-03-17 18:25:49,658 INFO org.apache.sentry.service.thrift.SentryService: Attempting to start...
2017-03-17 18:25:49,659 INFO org.apache.sentry.service.thrift.SentryService: HMSFollower is being configured
2017-03-17 18:25:49,667 INFO org.apache.sentry.service.thrift.HMSFollower: HMSFollower is being initialized
2017-03-17 18:25:50,108 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
{noformat}
[~kkalyan] Is there a way for me to verify that the version running here
includes SENTRY-1628? The error message in this log does not match the one
expected after SENTRY-1628. For example, after SENTRY-1628 there is a piece of
error handling in {{HMSFollower#run()}} which, I believe, should handle the
{{RuntimeException}} thrown from
{{FullUpdateInitializer.createInitialUpdate()}}.
{code}
} catch (Throwable t) {
  // catching errors to prevent the executor to halt.
  LOGGER.error("Caught unexpected exception in HMSFollower!", t.getCause());
}
{code}
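For context on why that catch block matters: the stack trace shows {{HMSFollower#run()}} being invoked through {{ScheduledThreadPoolExecutor}} and {{FutureTask.runAndReset}}, i.e. as a periodic task. Per the JDK documentation, a periodic task that throws has its subsequent executions suppressed. The following is only a minimal, self-contained sketch of that behaviour, not Sentry code: the class name {{FollowerRetrySketch}}, the method {{countRuns}}, the timings, and the use of {{scheduleWithFixedDelay}} are all assumptions made for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of Sentry.
public class FollowerRetrySketch {

    /**
     * Schedules a task that always fails and returns how many times it ran
     * within roughly 300 ms. When catchThrowable is true, the task is wrapped
     * in a catch (Throwable) guard similar in spirit to the quoted snippet.
     */
    static int countRuns(boolean catchThrowable) throws InterruptedException {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();

        // Stand-in for a follower iteration that fails every time.
        Runnable failing = () -> {
            runs.incrementAndGet();
            throw new RuntimeException("simulated: out of sequence response");
        };

        // Same work, but nothing escapes to the executor.
        Runnable guarded = () -> {
            try {
                failing.run();
            } catch (Throwable t) {
                // Swallow and report so the executor keeps rescheduling the task.
                System.err.println("Caught unexpected exception: " + t);
            }
        };

        // Assumed scheduling mode and delay; the real configuration may differ.
        executor.scheduleWithFixedDelay(
                catchThrowable ? guarded : failing, 0, 10, TimeUnit.MILLISECONDS);

        Thread.sleep(300);
        executor.shutdownNow();
        executor.awaitTermination(1, TimeUnit.SECONDS);
        return runs.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // The unguarded task runs once and is then silently dropped.
        System.out.println("unguarded runs: " + countRuns(false));
        // The guarded task keeps being rescheduled despite failing each time.
        System.out.println("guarded runs:   " + countRuns(true));
    }
}
```

In this sketch the unguarded task runs exactly once, while the guarded one keeps running until the executor is shut down. Note that the sketch prints {{t}} itself for simplicity, whereas the quoted snippet logs {{t.getCause()}}.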
> When HMSFollower exits in an abnormal way, encounter out of sequence error
> for the following connection.
> --------------------------------------------------------------------------------------------------------
>
> Key: SENTRY-1630
> URL: https://issues.apache.org/jira/browse/SENTRY-1630
> Project: Sentry
> Issue Type: Sub-task
> Components: Hdfs Plugin
> Affects Versions: sentry-ha-redesign
> Reporter: Hao Hao
> Assignee: Lei (Eddy) Xu
> Fix For: sentry-ha-redesign
>
>
> When HMSFollower exits abnormally, all subsequent connections encounter
> "out of sequence response" errors and SocketTimeoutException: Read timed out.
> Judging from HIVE-6893, this seems to be related to a connection leak.
> {noformat}
> 2017-02-15 19:03:42,822 ERROR org.apache.sentry.hdfs.FullUpdateInitializer: Task did not complete successfully after 0 tries. Exception got: org.apache.thrift.TApplicationException: get_database failed: out of sequence response
> 2017-02-15 19:03:42,827 ERROR org.apache.sentry.hdfs.FullUpdateInitializer: Task did not complete successfully after 0 tries. Exception got: MetaException(message:Got exception: org.apache.thrift.transport.TTransportException null)
> 2017-02-15 19:03:43,803 INFO hive.metastore: Closed a connection to metastore, current connections: 0
> 2017-02-15 19:03:43,803 ERROR org.apache.sentry.service.thrift.HMSFollower: Exception occurred persisting Hive full snapshot into DB
> java.lang.RuntimeException: org.apache.thrift.TApplicationException: get_database failed: out of sequence response
>     at org.apache.sentry.hdfs.FullUpdateInitializer.createInitialUpdate(FullUpdateInitializer.java:324)
>     at org.apache.sentry.service.thrift.HMSFollower.fetchFullUpdate(HMSFollower.java:343)
>     at org.apache.sentry.service.thrift.HMSFollower.run(HMSFollower.java:244)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.TApplicationException: get_database failed: out of sequence response
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:662)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:649)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:1213)
>     at org.apache.sentry.hdfs.FullUpdateInitializer$DbTask.doTask(FullUpdateInitializer.java:256)
>     at org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask$RetryStrategy.exec(FullUpdateInitializer.java:110)
>     at org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:150)
>     at org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:79)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     ... 3 more
> 2017-02-15 19:03:43,849 INFO org.apache.sentry.service.thrift.HMSFollower: Making a kerberos connection to HMS
> {noformat}