[
https://issues.apache.org/jira/browse/IMPALA-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120044#comment-17120044
]
Tim Armstrong commented on IMPALA-8990:
---------------------------------------
I saw this failure happen again. In the coordinator log, it looks like this
test isn't actually behaving as intended - the sleep(1000) query can actually
get admitted after the "select 1" query. Here are the logs including both the
query
{noformat}
$ grep -E '944ff1b6efb27c7f|d54fff57a02c497d'
impalad.impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com.jenkins.log.INFO.20200529-032359.16305
I0529 03:24:08.040937 17539 impala-server.cc:1079]
944ff1b6efb27c7f:82e927d800000000] Registered query
query_id=944ff1b6efb27c7f:82e927d800000000
session_id=ba4d93381c32d981:e55d763dd66f2b85
I0529 03:24:08.041055 17539 Frontend.java:1498]
944ff1b6efb27c7f:82e927d800000000] Analyzing query: select sleep(1000) db:
default
I0529 03:24:08.046777 17539 BaseAuthorizationChecker.java:110]
944ff1b6efb27c7f:82e927d800000000] Authorization check took 5 ms
I0529 03:24:08.046802 17539 Frontend.java:1540]
944ff1b6efb27c7f:82e927d800000000] Analysis and authorization finished.
I0529 03:24:08.054709 17539 impala-server.cc:1117] Query
944ff1b6efb27c7f:82e927d800000000 has idle timeout of 5s000ms
I0529 03:24:08.056697 17539 impala-server.cc:1079]
d54fff57a02c497d:940d4f9000000000] Registered query
query_id=d54fff57a02c497d:940d4f9000000000
session_id=ba4d93381c32d981:e55d763dd66f2b85
I0529 03:24:08.056809 17539 Frontend.java:1498]
d54fff57a02c497d:940d4f9000000000] Analyzing query: select 1 db: default
I0529 03:24:08.057070 17539 BaseAuthorizationChecker.java:110]
d54fff57a02c497d:940d4f9000000000] Authorization check took 0 ms
I0529 03:24:08.057091 17539 Frontend.java:1540]
d54fff57a02c497d:940d4f9000000000] Analysis and authorization finished.
I0529 03:24:08.058184 17539 impala-server.cc:1117] Query
d54fff57a02c497d:940d4f9000000000 has idle timeout of 5s000ms
I0529 03:24:08.066995 17561 admission-controller.cc:1205]
d54fff57a02c497d:940d4f9000000000] Scheduling query
d54fff57a02c497d:940d4f9000000000 with membership version 4
I0529 03:24:08.067004 17561 admission-controller.cc:1855]
d54fff57a02c497d:940d4f9000000000] Checking default executor group for pool
root.queueA
I0529 03:24:08.067009 17561 admission-controller.cc:1245]
d54fff57a02c497d:940d4f9000000000] Scheduling for executor group: default with
3 executors
I0529 03:24:08.067032 17561 admission-controller.cc:1290]
d54fff57a02c497d:940d4f9000000000] Trying to admit query to pool root.queueA in
executor group default (3 executors)
I0529 03:24:08.067044 17561 admission-controller.cc:1296]
d54fff57a02c497d:940d4f9000000000] Trying to admit
id=d54fff57a02c497d:940d4f9000000000 in pool_name=root.queueA
executor_group_name=default per_host_mem_estimate=10.00 MB
dedicated_coord_mem_estimate=100.00 MB max_requests=1 (configured statically)
max_queued=30 (configured statically) max_mem=97.66 GB (configured statically)
I0529 03:24:08.067073 17561 admission-controller.cc:1308]
d54fff57a02c497d:940d4f9000000000] Stats: agg_num_running=0, agg_num_queued=0,
agg_mem_reserved=0, local_host(local_mem_admitted=0, num_admitted_running=0,
num_queued=0, backend_mem_reserved=0)
I0529 03:24:08.067078 17561 admission-controller.cc:775]
d54fff57a02c497d:940d4f9000000000] Checking executor mem with
executor_mem_to_admit = 0 and min_admit_mem_limit.second = 9223372036854775807
I0529 03:24:08.067081 17561 admission-controller.cc:787]
d54fff57a02c497d:940d4f9000000000] Checking coordinator mem with
coord_mem_to_admit = 134217728 and coord_admit_mem_limit.second = 12884901888
I0529 03:24:08.067104 17561 admission-controller.cc:519]
d54fff57a02c497d:940d4f9000000000] Checking agg mem in pool=root.queueA :
agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0,
local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0,
backend_mem_reserved=0) executor_group=default cluster_mem_needed=128.00 MB
pool_max_mem=97.66 GB (configured statically)
I0529 03:24:08.067126 17561 admission-controller.cc:542]
d54fff57a02c497d:940d4f9000000000] Checking memory on
host=impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com:22000
mem_reserved=0 mem_admitted=0 needs=128.00 MB admit_mem_limit=12.00 GB
I0529 03:24:08.067129 17561 admission-controller.cc:884]
d54fff57a02c497d:940d4f9000000000] Can admit to group default (or cancelled)
I0529 03:24:08.067132 17561 admission-controller.cc:894]
d54fff57a02c497d:940d4f9000000000] Admitting query
id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.067138 17561 admission-controller.cc:1544]
d54fff57a02c497d:940d4f9000000000] For Query d54fff57a02c497d:940d4f9000000000
per_backend_mem_limit set to: 0 per_backend_mem_to_admit set to: 0
coord_backend_mem_limit set to: 128.00 MB coord_backend_mem_to_admit set to:
128.00 MB
I0529 03:24:08.067144 17561 admission-controller.cc:456]
d54fff57a02c497d:940d4f9000000000] Update admitted mem reserved for
host=impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com:22000 prev=0
new=128.00 MB
I0529 03:24:08.067147 17561 admission-controller.cc:461]
d54fff57a02c497d:940d4f9000000000] Update admitted queries for
host=impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com:22000 prev=0
new=1
I0529 03:24:08.067150 17561 admission-controller.cc:466]
d54fff57a02c497d:940d4f9000000000] Update slots in use for
host=impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com:22000 prev=0
new=1
I0529 03:24:08.067178 17561 admission-controller.cc:897]
d54fff57a02c497d:940d4f9000000000] Final: agg_num_running=1, agg_num_queued=0,
agg_mem_reserved=0, local_host(local_mem_admitted=128.00 MB,
num_admitted_running=1, num_queued=0, backend_mem_reserved=0)
I0529 03:24:08.067183 17561 impala-server.cc:1851]
d54fff57a02c497d:940d4f9000000000] Registering query locations
I0529 03:24:08.067188 17561 coordinator.cc:143]
d54fff57a02c497d:940d4f9000000000] Exec()
query_id=d54fff57a02c497d:940d4f9000000000 stmt=select 1
I0529 03:24:08.067229 17561 coordinator.cc:463]
d54fff57a02c497d:940d4f9000000000] starting execution on 1 backends for
query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.067288 17559 admission-controller.cc:1205]
944ff1b6efb27c7f:82e927d800000000] Scheduling query
944ff1b6efb27c7f:82e927d800000000 with membership version 4
I0529 03:24:08.067302 17559 admission-controller.cc:1855]
944ff1b6efb27c7f:82e927d800000000] Checking default executor group for pool
root.queueA
I0529 03:24:08.067310 17559 admission-controller.cc:1245]
944ff1b6efb27c7f:82e927d800000000] Scheduling for executor group: default with
3 executors
I0529 03:24:08.067340 17559 admission-controller.cc:1290]
944ff1b6efb27c7f:82e927d800000000] Trying to admit query to pool root.queueA in
executor group default (3 executors)
I0529 03:24:08.067358 17559 admission-controller.cc:1296]
944ff1b6efb27c7f:82e927d800000000] Trying to admit
id=944ff1b6efb27c7f:82e927d800000000 in pool_name=root.queueA
executor_group_name=default per_host_mem_estimate=10.00 MB
dedicated_coord_mem_estimate=100.00 MB max_requests=1 (configured statically)
max_queued=30 (configured statically) max_mem=97.66 GB (configured statically)
I0529 03:24:08.067370 16597 control-service.cc:153]
d54fff57a02c497d:940d4f9000000000] ExecQueryFInstances():
query_id=d54fff57a02c497d:940d4f9000000000
coord=impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com:22000
#instances=1
I0529 03:24:08.067375 17559 admission-controller.cc:1308]
944ff1b6efb27c7f:82e927d800000000] Stats: agg_num_running=1, agg_num_queued=0,
agg_mem_reserved=0, local_host(local_mem_admitted=128.00 MB,
num_admitted_running=1, num_queued=0, backend_mem_reserved=0)
I0529 03:24:08.067381 17559 admission-controller.cc:775]
944ff1b6efb27c7f:82e927d800000000] Checking executor mem with
executor_mem_to_admit = 0 and min_admit_mem_limit.second = 9223372036854775807
I0529 03:24:08.067386 17559 admission-controller.cc:787]
944ff1b6efb27c7f:82e927d800000000] Checking coordinator mem with
coord_mem_to_admit = 134217728 and coord_admit_mem_limit.second = 12884901888
I0529 03:24:08.067404 17559 admission-controller.cc:1323]
944ff1b6efb27c7f:82e927d800000000] Cannot admit query
TUniqueId(hi=-7759717864928281473, lo=-9013629370538917888) to group default:
number of running queries 1 is at or over limit 1 (configured statically).
I0529 03:24:08.067414 17559 admission-controller.cc:903]
944ff1b6efb27c7f:82e927d800000000] Queuing, query
id=944ff1b6efb27c7f:82e927d800000000 reason: number of running queries 1 is at
or over limit 1 (configured statically).
I0529 03:24:08.067530 17561 coordinator.cc:520]
d54fff57a02c497d:940d4f9000000000] started execution on 1 backends for
query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.067569 17564 query-state.cc:747]
d54fff57a02c497d:940d4f9000000000] Executing instance.
instance_id=d54fff57a02c497d:940d4f9000000000 fragment_idx=0
per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=1
I0529 03:24:08.108675 17539 impala-beeswax-server.cc:209]
get_results_metadata(): query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.108839 17564 krpc-data-stream-mgr.cc:337]
d54fff57a02c497d:940d4f9000000000] cancelling active streams for
fragment_instance_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.108855 17564 query-state.cc:756]
d54fff57a02c497d:940d4f9000000000] Instance completed.
instance_id=d54fff57a02c497d:940d4f9000000000 #in-flight=0 status=OK
I0529 03:24:08.108871 17563 query-state.cc:380]
d54fff57a02c497d:940d4f9000000000] UpdateBackendExecState(): last report for
d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.109004 17539 coordinator.cc:644] ExecState: query
id=d54fff57a02c497d:940d4f9000000000 execution completed
I0529 03:24:08.109011 17539 coordinator-backend-state.cc:934]
query_id=d54fff57a02c497d:940d4f9000000000 target backend=127.0.0.1:27000:
Sending CancelQueryFInstances rpc
I0529 03:24:08.109072 16596 control-service.cc:223] CancelQueryFInstances():
query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.109081 16596 query-exec-mgr.cc:105] QueryState:
query_id=d54fff57a02c497d:940d4f9000000000 refcnt=3
I0529 03:24:08.109083 16596 query-state.cc:773] Cancel:
query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.109087 16596 krpc-data-stream-mgr.cc:337] cancelling active
streams for fragment_instance_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.109136 17539 coordinator.cc:889] CancelBackends()
query_id=d54fff57a02c497d:940d4f9000000000, tried to cancel 1 backends
I0529 03:24:08.109139 17539 coordinator.cc:790] Coordinator waiting for
backends to finish, 1 remaining. query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.109179 16597 coordinator.cc:922] Backend completed:
host=impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com:22000
remaining=1 query_id=d54fff57a02c497d:940d4f9000000000
} for query id=d54fff57a02c497d:940d4f9000000000 agg_num_running=1,
agg_num_queued=1, agg_mem_reserved=0, local_host(local_mem_admitted=0,
num_admitted_running=1, num_queued=1, backend_mem_reserved=0)
I0529 03:24:08.109234 17563 query-state.cc:543]
d54fff57a02c497d:940d4f9000000000] Cancelling fragment instances as directed by
the coordinator. Returned status: Cancelled
I0529 03:24:08.109238 17539 coordinator.cc:1225] Release admission control
resources for query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.109241 17563 query-state.cc:773]
d54fff57a02c497d:940d4f9000000000] Cancel:
query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.109251 16800 admission-controller.cc:1517] Could not dequeue
query id=944ff1b6efb27c7f:82e927d800000000 reason: number of running queries 1
is at or over limit 1 (configured statically).
I0529 03:24:08.109274 17539 admission-controller.cc:995] Released query
id=d54fff57a02c497d:940d4f9000000000 agg_num_running=0, agg_num_queued=1,
agg_mem_reserved=0, local_host(local_mem_admitted=0, num_admitted_running=0,
num_queued=1, backend_mem_reserved=0)
I0529 03:24:08.109304 16800 admission-controller.cc:1296] Trying to admit
id=944ff1b6efb27c7f:82e927d800000000 in pool_name=root.queueA
executor_group_name=default per_host_mem_estimate=10.00 MB
dedicated_coord_mem_estimate=100.00 MB max_requests=1 (configured statically)
max_queued=30 (configured statically) max_mem=97.66 GB (configured statically)
I0529 03:24:08.109344 16800 admission-controller.cc:1451] Admitting from queue:
query=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.109355 16800 admission-controller.cc:1544] For Query
944ff1b6efb27c7f:82e927d800000000 per_backend_mem_limit set to: 0
per_backend_mem_to_admit set to: 0 coord_backend_mem_limit set to: 128.00 MB
coord_backend_mem_to_admit set to: 128.00 MB
I0529 03:24:08.109388 17559 admission-controller.cc:976]
944ff1b6efb27c7f:82e927d800000000] Admitted queued query
id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.109422 17559 admission-controller.cc:977]
944ff1b6efb27c7f:82e927d800000000] Final: agg_num_running=1, agg_num_queued=0,
agg_mem_reserved=0, local_host(local_mem_admitted=128.00 MB,
num_admitted_running=1, num_queued=0, backend_mem_reserved=0)
I0529 03:24:08.109431 17559 impala-server.cc:1851]
944ff1b6efb27c7f:82e927d800000000] Registering query locations
I0529 03:24:08.109452 17559 coordinator.cc:143]
944ff1b6efb27c7f:82e927d800000000] Exec()
query_id=944ff1b6efb27c7f:82e927d800000000 stmt=select sleep(1000)
I0529 03:24:08.109529 17559 coordinator.cc:463]
944ff1b6efb27c7f:82e927d800000000] starting execution on 1 backends for
query_id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.109673 16597 control-service.cc:153]
944ff1b6efb27c7f:82e927d800000000] ExecQueryFInstances():
query_id=944ff1b6efb27c7f:82e927d800000000
coord=impala-ec2-centos74-m5-4xlarge-ondemand-03e9.vpc.cloudera.com:22000
#instances=1
I0529 03:24:08.109819 17559 coordinator.cc:520]
944ff1b6efb27c7f:82e927d800000000] started execution on 1 backends for
query_id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.109849 17566 query-state.cc:747]
944ff1b6efb27c7f:82e927d800000000] Executing instance.
instance_id=944ff1b6efb27c7f:82e927d800000000 fragment_idx=0
per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=1
I0529 03:24:08.110978 17539 impala-beeswax-server.cc:260] close():
query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.110983 17539 impala-server.cc:1157] UnregisterQuery():
query_id=d54fff57a02c497d:940d4f9000000000
I0529 03:24:08.110987 17539 coordinator-backend-state.cc:934]
query_id=d54fff57a02c497d:940d4f9000000000 target backend=127.0.0.1:27000: Not
cancelling because the backend is already done:
I0529 03:24:08.110992 17539 coordinator.cc:889] CancelBackends()
query_id=d54fff57a02c497d:940d4f9000000000, tried to cancel 0 backends
I0529 03:24:08.111210 17539 impala-server.cc:1157] UnregisterQuery():
query_id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.111218 17539 coordinator.cc:644] ExecState: query
id=944ff1b6efb27c7f:82e927d800000000 execution cancelled
I0529 03:24:08.111222 17539 coordinator-backend-state.cc:934]
query_id=944ff1b6efb27c7f:82e927d800000000 target backend=127.0.0.1:27000:
Sending CancelQueryFInstances rpc
I0529 03:24:08.111292 16597 control-service.cc:223] CancelQueryFInstances():
query_id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.111299 16597 query-exec-mgr.cc:105] QueryState:
query_id=944ff1b6efb27c7f:82e927d800000000 refcnt=4
I0529 03:24:08.111300 16597 query-state.cc:773] Cancel:
query_id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.111304 16597 krpc-data-stream-mgr.cc:337] cancelling active
streams for fragment_instance_id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.111348 17539 coordinator.cc:889] CancelBackends()
query_id=944ff1b6efb27c7f:82e927d800000000, tried to cancel 1 backends
} for query id=944ff1b6efb27c7f:82e927d800000000 agg_num_running=1,
agg_num_queued=0, agg_mem_reserved=0, local_host(local_mem_admitted=0,
num_admitted_running=1, num_queued=0, backend_mem_reserved=0)
I0529 03:24:08.111395 17539 coordinator.cc:1225] Release admission control
resources for query_id=944ff1b6efb27c7f:82e927d800000000
I0529 03:24:08.111403 17539 admission-controller.cc:995] Released query
id=944ff1b6efb27c7f:82e927d800000000 agg_num_running=0, agg_num_queued=0,
agg_mem_reserved=0, local_host(local_mem_admitted=0, num_admitted_running=0,
num_queued=0, backend_mem_reserved=0)
I0529 03:24:08.111718 16816 query-exec-mgr.cc:192] ReleaseQueryState(): deleted
query_id=d54fff57a02c497d:940d4f9000000000
{noformat}
> TestAdmissionController.test_set_request_pool seems flaky
> ---------------------------------------------------------
>
> Key: IMPALA-8990
> URL: https://issues.apache.org/jira/browse/IMPALA-8990
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.4.0
> Reporter: Michael Ho
> Assignee: Bikramjeet Vig
> Priority: Critical
> Labels: flaky
>
> Expected query error didn't occur. Happened once so far. [~bikram], can you
> please take a look ?
> {noformat}
> Error Message
> AssertionError: Query should return error assert False
> Stacktrace
> hs2/hs2_test_suite.py:63: in add_session
> lambda: fn(self))
> hs2/hs2_test_suite.py:44: in add_session_helper
> fn()
> hs2/hs2_test_suite.py:63: in <lambda>
> lambda: fn(self))
> custom_cluster/test_admission_controller.py:312: in test_set_request_pool
> self.__check_pool_rejected(client, 'root.queueA', "exceeded timeout")
> custom_cluster/test_admission_controller.py:195: in __check_pool_rejected
> assert False, "Query should return error"
> E AssertionError: Query should return error
> E assert False
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]