[jira] [Comment Edited] (SENTRY-1630) When HMSFollower exits in an abnormal way, encounter out of sequence error for the following connection.

Lei (Eddy) Xu (JIRA) Mon, 20 Mar 2017 14:44:06 -0700

    [ 
https://issues.apache.org/jira/browse/SENTRY-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933633#comment-15933633
 ]


Lei (Eddy) Xu edited comment on SENTRY-1630 at 3/20/17 9:42 PM:
----------------------------------------------------------------

It looks like {{SentryService}} keeps restarting and creating new 
connections to HMS.

{noformat}
2017-03-17 18:25:44,746 INFO hive.metastore: Trying to connect to metastore 
with URI thrift://node-1.network1978:9083
2017-03-17 18:25:44,764 INFO hive.metastore: Opened a connection to metastore, 
current connections: 1
2017-03-17 18:25:44,807 INFO hive.metastore: Connected to metastore.
2017-03-17 18:25:44,807 INFO org.apache.sentry.service.thrift.HMSFollower: Non 
secure connection established with HMS
2017-03-17 18:25:44,807 INFO org.apache.sentry.service.thrift.HMSFollower: 
HMSFollower of Sentry successfully connected to HMS
2017-03-17 18:25:44,823 INFO org.apache.sentry.service.thrift.HMSFollower: 
Before fetching hive full snapshot, Current NotificationID = 995.
2017-03-17 18:25:46,075 ERROR org.apache.sentry.hdfs.FullUpdateInitializer: 
Task did not complete successfully after 0 tries. Exception got: 
org.apache.thrift.TApplicationException: get_partition_names failed: out of 
sequence response
2017-03-17 18:25:46,076 ERROR org.apache.sentry.hdfs.FullUpdateInitializer: 
Task did not complete successfully after 0 tries. Exception got: 
org.apache.thrift.TApplicationException: get_partitions_by_names failed: out of 
sequence response
2017-03-17 18:25:46,834 ERROR org.apache.sentry.service.thrift.HMSFollower: 
Exception occurred persisting Hive full snapshot into DB
java.lang.RuntimeException: org.apache.thrift.TApplicationException: 
get_partition_names failed: out of sequence response
        at 
org.apache.sentry.hdfs.FullUpdateInitializer.createInitialUpdate(FullUpdateInitializer.java:326)
        at 
org.apache.sentry.service.thrift.HMSFollower.fetchFullUpdate(HMSFollower.java:317)
        at 
org.apache.sentry.service.thrift.HMSFollower.run(HMSFollower.java:246)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.TApplicationException: get_partition_names failed: 
out of sequence response
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition_names(ThriftHiveMetastore.java:2164)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition_names(ThriftHiveMetastore.java:2149)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionNames(HiveMetaStoreClient.java:1376)
        at 
org.apache.sentry.hdfs.FullUpdateInitializer$TableTask.doTask(FullUpdateInitializer.java:227)
        at 
org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask$RetryStrategy.exec(FullUpdateInitializer.java:112)
        at 
org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:152)
        at 
org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:81)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        ... 3 more
2017-03-17 18:25:49,364 INFO org.apache.sentry.SentryMain: Configuring log4j to 
use 
[/var/run/cloudera-scm-agent/process/149-sentry-SENTRY_SERVER/sentry-log4j.properties]
2017-03-17 18:25:49,645 INFO org.apache.sentry.service.thrift.SentryService: 
Configured on address node-1.network1978/192.168.125.2:8038
2017-03-17 18:25:49,657 INFO 
org.apache.sentry.service.thrift.LeaderStatusMonitor: Leader election protocol 
disabled, assuming single active server
2017-03-17 18:25:49,658 INFO org.apache.sentry.service.thrift.SentryService: 
Attempting to start...
2017-03-17 18:25:49,659 INFO org.apache.sentry.service.thrift.SentryService: 
HMSFollower is being configured
2017-03-17 18:25:49,667 INFO org.apache.sentry.service.thrift.HMSFollower: 
HMSFollower is being initialized
2017-03-17 18:25:50,108 INFO DataNucleus.Persistence: Property 
datanucleus.cache.level2 unknown - will be ignored
{noformat}

[~kkalyan] Is there a way to verify that the version running here includes 
SENTRY-1628? The error message here does not match the one logged after 
SENTRY-1628. For example, after SENTRY-1628 there is a piece of error 
handling in {{HMSFollower#run()}}, which I believe should handle the 
{{RuntimeException}} thrown from 
{{FullUpdateInitializer.createInitialUpdate()}}.

{code}
 } catch (Throwable t) {
    // catching errors to prevent the executor to halt.
    LOGGER.error("Caught unexpected exception in HMSFollower!", t.getCause());
 }
{code}
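
For context on why that catch block matters: the stack trace above shows {{HMSFollower.run()}} being driven by a {{ScheduledThreadPoolExecutor}}, and the JDK documents that a periodic task which throws has all of its later executions suppressed. The following is a minimal standalone sketch of that behaviour, not Sentry code; the class name {{FollowerLoopSketch}}, the method names, and the timings are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is not the HMSFollower implementation.
class FollowerLoopSketch {

    // Stand-in for work that fails, such as a snapshot fetch hitting a Thrift error.
    static void failingWork() {
        throw new RuntimeException("simulated snapshot failure");
    }

    // Schedules a periodic task that always fails and returns how often it ran.
    static int countRuns(boolean guarded) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        Runnable task = () -> {
            runs.incrementAndGet();
            if (guarded) {
                try {
                    failingWork();
                } catch (Throwable t) {
                    // Swallowed here, so the executor keeps rescheduling the task.
                }
            } else {
                failingWork(); // Escapes the task: later executions are suppressed.
            }
        };
        executor.scheduleWithFixedDelay(task, 0, 10, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(200);
            executor.shutdownNow();
            executor.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            executor.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded runs: " + countRuns(false));
        System.out.println("guarded runs:   " + countRuns(true));
    }
}
```

The unguarded task runs once and is never rescheduled, while the guarded one keeps running on every period. Note that this only shows how a scheduled task behaves; it does not explain why the whole {{SentryService}} process appears to restart in the log above.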






> When HMSFollower exits in an abnormal way, encounter out of sequence error 
> for the following connection.
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SENTRY-1630
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1630
>             Project: Sentry
>          Issue Type: Sub-task
>          Components: Hdfs Plugin
>    Affects Versions: sentry-ha-redesign
>            Reporter: Hao Hao
>            Assignee: Lei (Eddy) Xu
>             Fix For: sentry-ha-redesign
>
>
> When HMSFollower exits in an abnormal way, all the following connections 
> encounter out of sequence and SocketTimeoutException: Read timed out errors. 
> Judging from HIVE-6893, this seems to be related to a connection leak problem.
> {noformat}2017-02-15 19:03:42,822 ERROR 
> org.apache.sentry.hdfs.FullUpdateInitializer: Task did not complete 
> successfully after 0 tries. Exception got: 
> org.apache.thrift.TApplicationException: get_database failed: out of sequence 
> response
> 2017-02-15 19:03:42,827 ERROR org.apache.sentry.hdfs.FullUpdateInitializer: 
> Task did not complete successfully after 0 tries. Exception got: 
> MetaException(message:Got exception: 
> org.apache.thrift.transport.TTransportException null)
> 2017-02-15 19:03:43,803 INFO hive.metastore: Closed a connection to 
> metastore, current connections: 0
> 2017-02-15 19:03:43,803 ERROR org.apache.sentry.service.thrift.HMSFollower: 
> Exception occurred persisting Hive full snapshot into DB
> java.lang.RuntimeException: org.apache.thrift.TApplicationException: 
> get_database failed: out of sequence response
>         at 
> org.apache.sentry.hdfs.FullUpdateInitializer.createInitialUpdate(FullUpdateInitializer.java:324)
>         at 
> org.apache.sentry.service.thrift.HMSFollower.fetchFullUpdate(HMSFollower.java:343)
>         at 
> org.apache.sentry.service.thrift.HMSFollower.run(HMSFollower.java:244)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.TApplicationException: get_database failed: out 
> of sequence response
>         at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:662)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:649)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:1213)
>         at 
> org.apache.sentry.hdfs.FullUpdateInitializer$DbTask.doTask(FullUpdateInitializer.java:256)
>         at 
> org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask$RetryStrategy.exec(FullUpdateInitializer.java:110)
>         at 
> org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:150)
>         at 
> org.apache.sentry.hdfs.FullUpdateInitializer$BaseTask.call(FullUpdateInitializer.java:79)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         ... 3 more
> 2017-02-15 19:03:43,849 INFO org.apache.sentry.service.thrift.HMSFollower: 
> Making a kerberos connection to HMS{noformat}
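
As background for the "out of sequence response" message: a Thrift client tags each request with a sequence id and rejects a reply whose id differs from the one it is waiting for. The toy model below sketches one way that check can fail, namely when a reply from an abandoned request is still pending on a reused connection. It is a simplified simulation under that assumption, not Thrift or Hive code, and it does not establish that this is the mechanism behind SENTRY-1630; {{SeqIdSketch}}, {{Connection}}, and {{Client}} are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical toy model of a sequence-id check; not the Thrift implementation.
class SeqIdSketch {

    // Stand-in for one socket: replies queue up in the order the server wrote them.
    static final class Connection {
        final Queue<Integer> pendingReplies = new ArrayDeque<>();
    }

    // Toy client that numbers each request and checks the number on receive.
    static final class Client {
        private final Connection conn;
        private int seqId = 0;

        Client(Connection conn) {
            this.conn = conn;
        }

        void send() {
            seqId++;
            conn.pendingReplies.add(seqId); // The server echoes the request's id.
        }

        String receive() {
            Integer replySeqId = conn.pendingReplies.poll();
            if (replySeqId == null) {
                return "no response";
            }
            return replySeqId == seqId ? "ok" : "out of sequence response";
        }
    }

    // Request 1 is sent but its reply is never read; request 2 then reads reply 1.
    static String callAfterAbandonedRequest() {
        Client client = new Client(new Connection());
        client.send();
        client.send();
        return client.receive();
    }

    // A fresh connection has no stale reply, so the ids line up.
    static String callOnFreshConnection() {
        Client client = new Client(new Connection());
        client.send();
        return client.receive();
    }

    public static void main(String[] args) {
        System.out.println(callAfterAbandonedRequest());
        System.out.println(callOnFreshConnection());
    }
}
```

In this model, discarding the connection after an abandoned call and opening a new one avoids the mismatch, which is the general idea behind closing connections cleanly instead of leaking them.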



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
