[
https://issues.apache.org/jira/browse/IMPALA-10259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17253728#comment-17253728
]
Wenzhe Zhou commented on IMPALA-10259:
--------------------------------------
This issue happened recently in cdpd-master core ASAN build:
[https://master-02.jenkins.cloudera.com/job/impala-cdpd-master-core-asan/497/]
It happened when executing test case:
tests/query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling_naaj.
According to log messages, coordinator scheduled query fragment instances on
three executors, which were listening on port 27000, 27001, 27002 respectively.
The second executor, which was listening on port 27001, sent status report to
coordinator with instance_exec_status as done to make
BackendState::num_remaining_instances_ decrease to 0 so that coordinator change
the corresponding BackendState as done. Later, the executor got error "Error
from query 0b4f6b681d27f917:a71a3c0b00000000: Memory limit exceeded: Could not
free memory by spilling to disk: scratch_limit is 0", and sent last status
report to the coordinator. But the coordinator did not apply the exec status
report and returned response with Status::OK since the BackendState was marked
as done. This made the executor to hit DCHECK failure.
Here are the log messages on three impalad:
*Log messages on executor
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000*
I1213 21:16:24.329084 16251 impala-server.cc:1212]
0b4f6b681d27f917:a71a3c0b00000000] Registered query
query_id=0b4f6b681d27f917:a71a3c0b00000000
session_id=f6448bd69bd96332:83f4be5c320631b9
I1213 21:16:24.329277 16251 Frontend.java:1522]
0b4f6b681d27f917:a71a3c0b00000000] Analyzing query: select l_orderkey,
l_partkey, l_suppkey, l_linenumber
from lineitem
where if(l_orderkey % 2 = 0, NULL, l_partkey) not in (
select p_partkey
from part
where p_retailprice != l_extendedprice * l_tax)
order by 1,2,3,4 limit 5 db: tpch_parquet
I1213 21:16:24.331490 16251 BaseAuthorizationChecker.java:110]
0b4f6b681d27f917:a71a3c0b00000000] Authorization check took 2 ms
I1213 21:16:24.331568 16251 Frontend.java:1564]
0b4f6b681d27f917:a71a3c0b00000000] Analysis and authorization finished.
I1213 21:16:24.336283 30797 admission-controller.cc:1630]
0b4f6b681d27f917:a71a3c0b00000000] Trying to admit
id=0b4f6b681d27f917:a71a3c0b00000000 in pool_name=default-pool
executor_group_name=default per_host_mem_estimate=520.59 MB
dedicated_coord_mem_estimate=100.02 MB max_requests=-1 max_queued=200
max_mem=-1.00 B
I1213 21:16:24.336367 30797 admission-controller.cc:1638]
0b4f6b681d27f917:a71a3c0b00000000] Stats: agg_num_running=3, agg_num_queued=0,
agg_mem_reserved=9.72 GB, local_host(local_mem_admitted=248.80 GB,
num_admitted_running=3, num_queued=0, backend_mem_reserved=139.99 KB,
topN_query_stats: queries=[3f4a3ecbde524632:bd18d56c00000000,
874bc9e6299fc558:6a5dfa3d00000000, 164f449fbd3220c6:3ee5b1c800000000,
b440cb236aa701ee:f6fa8b0700000000], total_mem_consumed=139.99 KB,
fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, min=11.29 KB,
max=59.36 KB, pool_total_mem=139.99 KB, average_per_query=35.00 KB)
I1213 21:16:24.336488 30797 admission-controller.cc:1185]
0b4f6b681d27f917:a71a3c0b00000000] Admitting query
id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:24.336814 30797 impala-server.cc:2032]
0b4f6b681d27f917:a71a3c0b00000000] Registering query locations
I1213 21:16:24.336901 30797 coordinator.cc:148]
0b4f6b681d27f917:a71a3c0b00000000] Exec()
query_id=0b4f6b681d27f917:a71a3c0b00000000 stmt=select l_orderkey, l_partkey,
l_suppkey, l_linenumber
from lineitem
where if(l_orderkey % 2 = 0, NULL, l_partkey) not in (
select p_partkey
from part
where p_retailprice != l_extendedprice * l_tax)
order by 1,2,3,4 limit 5
I1213 21:16:24.338374 30797 coordinator.cc:472]
0b4f6b681d27f917:a71a3c0b00000000] starting execution on 3 backends for
query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:24.340186 23828 control-service.cc:142]
0b4f6b681d27f917:a71a3c0b00000000] ExecQueryFInstances():
query_id=0b4f6b681d27f917:a71a3c0b00000000
coord=impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000
#instances=2
I1213 21:16:24.342927 30801 query-state.cc:897]
0b4f6b681d27f917:a71a3c0b00000003] Executing instance.
instance_id=0b4f6b681d27f917:a71a3c0b00000003 fragment_idx=1
per_fragment_instance_idx=1 coord_state_idx=0 #in-flight=4
I1213 21:16:24.343096 30802 query-state.cc:897]
0b4f6b681d27f917:a71a3c0b00000000] Executing instance.
instance_id=0b4f6b681d27f917:a71a3c0b00000000 fragment_idx=0
per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=5
I1213 21:16:24.343809 30797 coordinator.cc:531]
0b4f6b681d27f917:a71a3c0b00000000] started execution on 3 backends for
query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:24.395705 30807 admission-controller.cc:1638]
1e4895c902a7df9d:ea55d7b700000000] Stats: agg_num_running=3, agg_num_queued=0,
agg_mem_reserved=9.78 GB, local_host(local_mem_admitted=250.31 GB,
num_admitted_running=3, num_queued=0, backend_mem_reserved=29.65 MB,
topN_query_stats: queries=[0b4f6b681d27f917:a71a3c0b00000000,
3f4a3ecbde524632:bd18d56c00000000, c946a8029dce14c0:a87471b400000000,
b440cb236aa701ee:f6fa8b0700000000], total_mem_consumed=29.65 MB,
fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, min=11.29 KB,
max=29.56 MB, pool_total_mem=29.65 MB, average_per_query=7.41 MB)
I1213 21:16:24.465718 30813 admission-controller.cc:1638]
8f49abb9e4f88fb1:898d815700000000] Stats: agg_num_running=3, agg_num_queued=0,
agg_mem_reserved=9.78 GB, local_host(local_mem_admitted=250.31 GB,
num_admitted_running=3, num_queued=0, backend_mem_reserved=29.65 MB,
topN_query_stats: queries=[0b4f6b681d27f917:a71a3c0b00000000,
3f4a3ecbde524632:bd18d56c00000000, c946a8029dce14c0:a87471b400000000,
b440cb236aa701ee:f6fa8b0700000000], total_mem_consumed=29.65 MB,
fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, min=11.29 KB,
max=29.56 MB, pool_total_mem=29.65 MB, average_per_query=7.41 MB)
{color:#de350b}_--- got status report from executor on port 27001, backend
state was set as done, but status OK so coordinator did not call
UpdateExecState to update exec state._{color}
*I1213 21:16:44.394950 23824 coordinator.cc:959] Backend completed:
host=impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27001
remaining=3 query_id=0b4f6b681d27f917:a71a3c0b00000000*
I1213 21:16:44.395043 23824 coordinator-backend-state.cc:362]
query_id=0b4f6b681d27f917:a71a3c0b00000000: first in-progress backend:
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000
I1213 21:16:44.653590 2706 krpc-data-stream-mgr.cc:306]
0b4f6b681d27f917:a71a3c0b00000003] DeregisterRecvr():
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000003, node=4
I1213 21:16:45.173908 30801 runtime-state.cc:196]
0b4f6b681d27f917:a71a3c0b00000003] Error from query
0b4f6b681d27f917:a71a3c0b00000000: Memory limit exceeded: Could not free memory
by spilling to disk: scratch_limit is 0
Error occurred on backend
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:*27000* by
fragment 0b4f6b681d27f917:a71a3c0b00000003
Memory left in process limit: 11.27 GB
Query(0b4f6b681d27f917:a71a3c0b00000000): Reservation=90.00 MB
ReservationLimit=90.00 MB OtherMemory=9.70 MB Total=99.70 MB Peak=99.79 MB
Fragment 0b4f6b681d27f917:a71a3c0b00000003: Reservation=90.00 MB
OtherMemory=9.65 MB Total=99.65 MB Peak=99.74 MB
SORT_NODE (id=3): Total=6.12 MB Peak=8.05 MB
HASH_JOIN_NODE (id=2): Reservation=47.50 MB OtherMemory=83.25 KB Total=47.58
MB Peak=47.58 MB
Exprs: Total=17.12 KB Peak=17.12 KB
Hash Join Builder (join_node_id=2): Total=13.12 KB Peak=21.12 KB
Hash Join Builder (join_node_id=2) Exprs: Total=13.12 KB Peak=13.12 KB
HDFS_SCAN_NODE (id=0): Reservation=42.50 MB OtherMemory=3.44 MB Total=45.94 MB
Peak=47.18 MB
Queued Batches: Total=1.97 MB Peak=3.16 MB
EXCHANGE_NODE (id=4): Reservation=0 OtherMemory=0 Total=0 Peak=7.63 MB
KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=5): Total=4.41 KB Peak=4.41 KB
Fragment 0b4f6b681d27f917:a71a3c0b00000000: Reservation=0 OtherMemory=8.00 KB
Total=8.00 KB Peak=8.00 KB
EXCHANGE_NODE (id=5): Reservation=0 OtherMemory=0 Total=0 Peak=0
KrpcDeferredRpcs: Total=0 Peak=0
PLAN_ROOT_SINK: Total=0 Peak=0
CodeGen: Total=1.44 KB Peak=423.00 KB
CodeGen: Total=42.90 KB Peak=6.36 MB
I1213 21:16:45.174247 30801 krpc-data-stream-mgr.cc:337]
0b4f6b681d27f917:a71a3c0b00000003] cancelling active streams for
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000003
I1213 21:16:45.183601 30799 query-state.cc:464]
0b4f6b681d27f917:a71a3c0b00000000] UpdateBackendExecState(): last report for
0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.183585 30801 query-state.cc:906]
0b4f6b681d27f917:a71a3c0b00000003] Instance completed.
instance_id=0b4f6b681d27f917:a71a3c0b00000003 #in-flight=5
status=MEM_LIMIT_EXCEEDED: Memory limit exceeded: Could not free memory by
spilling to disk: scratch_limit is 0
Error occurred on backend
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000 by fragment
0b4f6b681d27f917:a71a3c0b00000003
Memory left in process limit: 11.27 GB
Query(0b4f6b681d27f917:a71a3c0b00000000): Reservation=90.00 MB
ReservationLimit=90.00 MB OtherMemory=9.70 MB Total=99.70 MB Peak=99.79 MB
Fragment 0b4f6b681d27f917:a71a3c0b00000003: Reservation=90.00 MB
OtherMemory=9.65 MB Total=99.65 MB Peak=99.74 MB
SORT_NODE (id=3): Total=6.12 MB Peak=8.05 MB
HASH_JOIN_NODE (id=2): Reservation=47.50 MB OtherMemory=83.25 KB Total=47.58
MB Peak=47.58 MB
Exprs: Total=17.12 KB Peak=17.12 KB
Hash Join Builder (join_node_id=2): Total=13.12 KB Peak=21.12 KB
Hash Join Builder (join_node_id=2) Exprs: Total=13.12 KB Peak=13.12 KB
HDFS_SCAN_NODE (id=0): Reservation=42.50 MB OtherMemory=3.44 MB Total=45.94 MB
Peak=47.18 MB
Queued Batches: Total=1.97 MB Peak=3.16 MB
EXCHANGE_NODE (id=4): Reservation=0 OtherMemory=0 Total=0 Peak=7.63 MB
KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=5): Total=4.41 KB Peak=4.41 KB
Fragment 0b4f6b681d27f917:a71a3c0b00000000: Reservation=0 OtherMemory=8.00 KB
Total=8.00 KB Peak=8.00 KB
EXCHANGE_NODE (id=5): Reservation=0 OtherMemory=0 Total=0 Peak=0
KrpcDeferredRpcs: Total=0 Peak=0
PLAN_ROOT_SINK: Total=0 Peak=0
CodeGen: Total=1.44 KB Peak=423.00 KB
CodeGen: Total=42.90 KB Peak=6.36 MB
I1213 21:16:45.192860 23824 coordinator.cc:959] Backend completed:
host=impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000
remaining=2 query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.192946 23824 coordinator-backend-state.cc:362]
query_id=0b4f6b681d27f917:a71a3c0b00000000: first in-progress backend:
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27002
I1213 21:16:45.193042 23824 coordinator.cc:712] ExecState: query
id=0b4f6b681d27f917:a71a3c0b00000000
finstance=0b4f6b681d27f917:a71a3c0b00000003 on
host=impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000
(EXECUTING -> ERROR) status=Memory limit exceeded: Could not free memory by
spilling to disk: scratch_limit is 0
Error occurred on backend
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000 by fragment
0b4f6b681d27f917:a71a3c0b00000003
Memory left in process limit: 11.27 GB
Query(0b4f6b681d27f917:a71a3c0b00000000): Reservation=90.00 MB
ReservationLimit=90.00 MB OtherMemory=9.70 MB Total=99.70 MB Peak=99.79 MB
Fragment 0b4f6b681d27f917:a71a3c0b00000003: Reservation=90.00 MB
OtherMemory=9.65 MB Total=99.65 MB Peak=99.74 MB
SORT_NODE (id=3): Total=6.12 MB Peak=8.05 MB
HASH_JOIN_NODE (id=2): Reservation=47.50 MB OtherMemory=83.25 KB Total=47.58
MB Peak=47.58 MB
Exprs: Total=17.12 KB Peak=17.12 KB
Hash Join Builder (join_node_id=2): Total=13.12 KB Peak=21.12 KB
Hash Join Builder (join_node_id=2) Exprs: Total=13.12 KB Peak=13.12 KB
HDFS_SCAN_NODE (id=0): Reservation=42.50 MB OtherMemory=3.44 MB Total=45.94 MB
Peak=47.18 MB
Queued Batches: Total=1.97 MB Peak=3.16 MB
EXCHANGE_NODE (id=4): Reservation=0 OtherMemory=0 Total=0 Peak=7.63 MB
KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=5): Total=4.41 KB Peak=4.41 KB
Fragment 0b4f6b681d27f917:a71a3c0b00000000: Reservation=0 OtherMemory=8.00 KB
Total=8.00 KB Peak=8.00 KB
EXCHANGE_NODE (id=5): Reservation=0 OtherMemory=0 Total=0 Peak=0
KrpcDeferredRpcs: Total=0 Peak=0
PLAN_ROOT_SINK: Total=0 Peak=0
CodeGen: Total=1.44 KB Peak=423.00 KB
CodeGen: Total=42.90 KB Peak=6.36 MB
I1213 21:16:45.193156 23824 coordinator-backend-state.cc:921]
query_id=0b4f6b681d27f917:a71a3c0b00000000 target backend=127.0.0.1:27000: Not
cancelling because the backend is already done: Memory limit exceeded: Could
not free memory by spilling to disk: scratch_limit is 0
Error occurred on backend
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000 by fragment
0b4f6b681d27f917:a71a3c0b00000003
Memory left in process limit: 11.27 GB
Query(0b4f6b681d27f917:a71a3c0b00000000): Reservation=90.00 MB
ReservationLimit=90.00 MB OtherMemory=9.70 MB Total=99.70 MB Peak=99.79 MB
Fragment 0b4f6b681d27f917:a71a3c0b00000003: Reservation=90.00 MB
OtherMemory=9.65 MB Total=99.65 MB Peak=99.74 MB
SORT_NODE (id=3): Total=6.12 MB Peak=8.05 MB
HASH_JOIN_NODE (id=2): Reservation=47.50 MB OtherMemory=83.25 KB Total=47.58
MB Peak=47.58 MB
Exprs: Total=17.12 KB Peak=17.12 KB
Hash Join Builder (join_node_id=2): Total=13.12 KB Peak=21.12 KB
Hash Join Builder (join_node_id=2) Exprs: Total=13.12 KB Peak=13.12 KB
HDFS_SCAN_NODE (id=0): Reservation=42.50 MB OtherMemory=3.44 MB Total=45.94 MB
Peak=47.18 MB
Queued Batches: Total=1.97 MB Peak=3.16 MB
EXCHANGE_NODE (id=4): Reservation=0 OtherMemory=0 Total=0 Peak=7.63 MB
KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=5): Total=4.41 KB Peak=4.41 KB
Fragment 0b4f6b681d27f917:a71a3c0b00000000: Reservation=0 OtherMemory=8.00 KB
Total=8.00 KB Peak=8.00 KB
EXCHANGE_NODE (id=5): Reservation=0 OtherMemory=0 Total=0 Peak=0
KrpcDeferredRpcs: Total=0 Peak=0
PLAN_ROOT_SINK: Total=0 Peak=0
CodeGen: Total=1.44 KB Peak=423.00 KB
CodeGen: Total=42.90 KB Peak=6.36 MB
I1213 21:16:45.193229 23824 coordinator-backend-state.cc:921]
query_id=0b4f6b681d27f917:a71a3c0b00000000 target backend=127.0.0.1:27002:
Sending CancelQueryFInstances rpc
I1213 21:16:45.194350 23824 coordinator-backend-state.cc:921]
query_id=0b4f6b681d27f917:a71a3c0b00000000 target backend=127.0.0.1:27001: Not
cancelling because the backend is already done:
I1213 21:16:45.194430 23824 coordinator.cc:926] CancelBackends()
query_id=0b4f6b681d27f917:a71a3c0b00000000, tried to cancel 1 backends
I1213 21:16:45.194624 23824 coordinator.cc:1272] Release admission control
resources for query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.197295 30799 query-state.cc:686]
0b4f6b681d27f917:a71a3c0b00000000] Cancelling fragment instances as directed by
the coordinator. Returned status: Cancelled
I1213 21:16:45.197345 30799 query-state.cc:923]
0b4f6b681d27f917:a71a3c0b00000000] Cancel:
query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.197432 30799 krpc-data-stream-mgr.cc:337]
0b4f6b681d27f917:a71a3c0b00000000] cancelling active streams for
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.197516 30799 krpc-data-stream-mgr.cc:337]
0b4f6b681d27f917:a71a3c0b00000000] cancelling active streams for
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000003
I1213 21:16:45.198832 30802 krpc-data-stream-mgr.cc:337]
0b4f6b681d27f917:a71a3c0b00000000] cancelling active streams for
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.198951 30802 krpc-data-stream-mgr.cc:306]
0b4f6b681d27f917:a71a3c0b00000000] DeregisterRecvr():
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000000, node=5
I1213 21:16:45.199146 30802 query-state.cc:906]
0b4f6b681d27f917:a71a3c0b00000000] Instance completed.
instance_id=0b4f6b681d27f917:a71a3c0b00000000 #in-flight=4 status=CANCELLED:
Cancelled
I1213 21:16:45.220476 16251 impala-beeswax-server.cc:260] close():
query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.220543 16251 impala-server.cc:1290] UnregisterQuery():
query_id=0b4f6b681d27f917:a71a3c0b00000000
*Log messages on executor
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27001*
I1213 21:16:24.341915 24136 control-service.cc:142]
0b4f6b681d27f917:a71a3c0b00000000] ExecQueryFInstances():
query_id=0b4f6b681d27f917:a71a3c0b00000000
coord=impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000
#instances=2
I1213 21:16:24.345479 30805 query-state.cc:897]
0b4f6b681d27f917:a71a3c0b00000001] Executing instance.
instance_id=0b4f6b681d27f917:a71a3c0b00000001 fragment_idx=2
per_fragment_instance_idx=0 coord_state_idx=2 #in-flight=3
I1213 21:16:24.345636 30806 query-state.cc:897]
0b4f6b681d27f917:a71a3c0b00000004] Executing instance.
instance_id=0b4f6b681d27f917:a71a3c0b00000004 fragment_idx=1
per_fragment_instance_idx=2 coord_state_idx=2 #in-flight=4
I1213 21:16:24.631848 30805 krpc-data-stream-mgr.cc:337]
0b4f6b681d27f917:a71a3c0b00000001] cancelling active streams for
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000001
I1213 21:16:24.632047 30805 query-state.cc:906]
0b4f6b681d27f917:a71a3c0b00000001] Instance completed.
instance_id=0b4f6b681d27f917:a71a3c0b00000001 #in-flight=3 status=OK
......
1213 21:16:43.751832 2547 krpc-data-stream-mgr.cc:306]
0b4f6b681d27f917:a71a3c0b00000004] DeregisterRecvr():
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000004, node=4
I1213 21:16:44.356305 30806 runtime-state.cc:196]
0b4f6b681d27f917:a71a3c0b00000004] Error from query
0b4f6b681d27f917:a71a3c0b00000000: Memory limit exceeded: Could not free memory
by spilling to disk: scratch_limit is 0
Error occurred on backend
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:*27001* by
fragment 0b4f6b681d27f917:a71a3c0b00000004
Memory left in process limit: 1.80 GB
Query(0b4f6b681d27f917:a71a3c0b00000000): Reservation=90.00 MB
ReservationLimit=90.00 MB OtherMemory=10.88 MB Total=100.88 MB Peak=100.88 MB
Fragment 0b4f6b681d27f917:a71a3c0b00000001: Reservation=0 OtherMemory=0
Total=0 Peak=6.13 MB
HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 Total=0 Peak=6.08 MB
KrpcDataStreamSender (dst_id=4): Total=0 Peak=22.57 KB
Fragment 0b4f6b681d27f917:a71a3c0b00000004: Reservation=90.00 MB
OtherMemory=10.84 MB Total=100.84 MB Peak=100.84 MB
SORT_NODE (id=3): Total=6.12 MB Peak=8.05 MB
HASH_JOIN_NODE (id=2): Reservation=47.50 MB OtherMemory=83.25 KB Total=47.58
MB Peak=47.58 MB
Exprs: Total=17.12 KB Peak=17.12 KB
Hash Join Builder (join_node_id=2): Total=13.12 KB Peak=21.12 KB
Hash Join Builder (join_node_id=2) Exprs: Total=13.12 KB Peak=13.12 KB
HDFS_SCAN_NODE (id=0): Reservation=42.50 MB OtherMemory=4.63 MB Total=47.13 MB
Peak=47.18 MB
Queued Batches: Total=3.16 MB Peak=3.17 MB
EXCHANGE_NODE (id=4): Reservation=0 OtherMemory=0 Total=0 Peak=7.63 MB
KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=5): Total=4.41 KB Peak=4.41 KB
CodeGen: Total=852.00 B Peak=52.50 KB
CodeGen: Total=42.90 KB Peak=6.36 MB
I1213 21:16:44.356761 30806 krpc-data-stream-mgr.cc:337]
0b4f6b681d27f917:a71a3c0b00000004] cancelling active streams for
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000004
II1213 21:16:44.381556 30806 query-state.cc:906]
0b4f6b681d27f917:a71a3c0b00000004] Instance completed.
instance_id=0b4f6b681d27f917:a71a3c0b00000004 #in-flight=2
status=MEM_LIMIT_EXCEEDED: Memory limit exceeded: Could not free memory by
spilling to disk: scratch_limit is 0
Error occurred on backend
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27001 by fragment
0b4f6b681d27f917:a71a3c0b00000004
Memory left in process limit: 1.80 GB
Query(0b4f6b681d27f917:a71a3c0b00000000): Reservation=90.00 MB
ReservationLimit=90.00 MB OtherMemory=10.88 MB Total=100.88 MB Peak=100.88 MB
Fragment 0b4f6b681d27f917:a71a3c0b00000001: Reservation=0 OtherMemory=0
Total=0 Peak=6.13 MB
HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 Total=0 Peak=6.08 MB
KrpcDataStreamSender (dst_id=4): Total=0 Peak=22.57 KB
Fragment 0b4f6b681d27f917:a71a3c0b00000004: Reservation=90.00 MB
OtherMemory=10.84 MB Total=100.84 MB Peak=100.84 MB
SORT_NODE (id=3): Total=6.12 MB Peak=8.05 MB
HASH_JOIN_NODE (id=2): Reservation=47.50 MB OtherMemory=83.25 KB Total=47.58
MB Peak=47.58 MB
Exprs: Total=17.12 KB Peak=17.12 KB
Hash Join Builder (join_node_id=2): Total=13.12 KB Peak=21.12 KB
Hash Join Builder (join_node_id=2) Exprs: Total=13.12 KB Peak=13.12 KB
HDFS_SCAN_NODE (id=0): Reservation=42.50 MB OtherMemory=4.63 MB Total=47.13 MB
Peak=47.18 MB
Queued Batches: Total=3.16 MB Peak=3.17 MB
EXCHANGE_NODE (id=4): Reservation=0 OtherMemory=0 Total=0 Peak=7.63 MB
KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=5): Total=4.41 KB Peak=4.41 KB
CodeGen: Total=852.00 B Peak=52.50 KB
CodeGen: Total=42.90 KB Peak=6.36 MB
I*1213 21:16:44.396633 30804 query-state.cc:464]
0b4f6b681d27f917:a71a3c0b00000000] UpdateBackendExecState(): last report for
0b4f6b681d27f917:a71a3c0b00000000*
*F1213 21:16:44.399322 30804 query-state.cc:877]
0b4f6b681d27f917:a71a3c0b00000000] Check failed: is_cancelled_.Load() == 1 (0
vs. 1)*
*Log messages on executor
impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27002*
I1213 21:16:24.340821 24179 control-service.cc:142]
0b4f6b681d27f917:a71a3c0b00000000] ExecQueryFInstances():
query_id=0b4f6b681d27f917:a71a3c0b00000000
coord=impala-ec2-centos74-r5-4xlarge-ondemand-053c.vpc.cloudera.com:27000
#instances=1
I1213 21:16:24.343477 30803 query-state.cc:897]
0b4f6b681d27f917:a71a3c0b00000002] Executing instance.
instance_id=0b4f6b681d27f917:a71a3c0b00000002 fragment_idx=1
per_fragment_instance_idx=0 coord_state_idx=1 #in-flight=1
I1213 21:16:33.402086 32272 krpc-data-stream-mgr.cc:306]
0b4f6b681d27f917:a71a3c0b00000002] DeregisterRecvr():
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000002, node=4
I1213 21:16:45.193837 24179 control-service.cc:213] CancelQueryFInstances():
query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.193897 24179 query-exec-mgr.cc:126] QueryState:
query_id=0b4f6b681d27f917:a71a3c0b00000000 refcnt=3
I1213 21:16:45.193934 24179 query-state.cc:923] Cancel:
query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.193979 24179 krpc-data-stream-mgr.cc:337] cancelling active
streams for fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000002
I1213 21:16:45.194149 30803 krpc-data-stream-mgr.cc:337]
0b4f6b681d27f917:a71a3c0b00000002] cancelling active streams for
fragment_instance_id=0b4f6b681d27f917:a71a3c0b00000002
I1213 21:16:45.199045 30800 query-state.cc:464]
0b4f6b681d27f917:a71a3c0b00000000] UpdateBackendExecState(): last report for
0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.199033 30803 query-state.cc:906]
0b4f6b681d27f917:a71a3c0b00000002] Instance completed.
instance_id=0b4f6b681d27f917:a71a3c0b00000002 #in-flight=0 status=CANCELLED:
Cancelled
I1213 21:16:45.202457 30800 query-state.cc:686]
0b4f6b681d27f917:a71a3c0b00000000] Cancelling fragment instances as directed by
the coordinator. Returned status: Cancelled
I1213 21:16:45.202517 30800 query-state.cc:923]
0b4f6b681d27f917:a71a3c0b00000000] Cancel:
query_id=0b4f6b681d27f917:a71a3c0b00000000
I1213 21:16:45.214329 30800 query-exec-mgr.cc:213]
0b4f6b681d27f917:a71a3c0b00000000] ReleaseQueryState(): deleted
query_id=0b4f6b681d27f917:a71a3c0b00000000
> Hit DCHECK in TestImpalaShell.test_completed_query_errors_2
> -----------------------------------------------------------
>
> Key: IMPALA-10259
> URL: https://issues.apache.org/jira/browse/IMPALA-10259
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Quanlong Huang
> Assignee: Wenzhe Zhou
> Priority: Blocker
> Labels: broken-build, crash
>
> TestImpalaShell.test_completed_query_errors_2 hits a DCHECK in a core ASAN
> build:
> {code:java}
> F1016 17:08:54.728466 19955 query-state.cc:877]
> 924f4ce603ac07bb:a08656e300000000] Check failed: is_cancelled_.Load() == 1 (0
> vs. 1) {code}
> The test is:
> {code:java}
> shell.test_shell_commandline.TestImpalaShell.test_completed_query_errors_2[table_format_and_file_extension:
> ('textfile', '.txt') | protocol: hs2] {code}
> The query is:
> {code:java}
> I1016 17:08:49.026532 19947 Frontend.java:1522]
> 924f4ce603ac07bb:a08656e300000000] Analyzing query: select id, cnt from
> functional_parquet.bad_column_metadata t, (select 1 cnt) u db: default {code}
> Query options:
> {code:java}
> I1016 17:08:49.020670 19947 impala-hs2-server.cc:269]
> TClientRequest.queryOptions: TQueryOptions {
> 01: abort_on_error (bool) = true,
> 02: max_errors (i32) = 100,
> 03: disable_codegen (bool) = false,
> 04: batch_size (i32) = 0,
> 05: num_nodes (i32) = 0,
> 06: max_scan_range_length (i64) = 0,
> 07: num_scanner_threads (i32) = 0,
> 11: debug_action (string) = "",
> 12: mem_limit (i64) = 0,
> 15: hbase_caching (i32) = 0,
> 16: hbase_cache_blocks (bool) = false,
> 17: parquet_file_size (i64) = 0,
> 18: explain_level (i32) = 1,
> 19: sync_ddl (bool) = false,
> 24: disable_outermost_topn (bool) = false,
> 26: query_timeout_s (i32) = 0,
> 28: appx_count_distinct (bool) = false,
> 29: disable_unsafe_spills (bool) = false,
> 31: exec_single_node_rows_threshold (i32) = 100,
> 32: optimize_partition_key_scans (bool) = false,
> 33: replica_preference (i32) = 0,
> 34: schedule_random_replica (bool) = false,
> 36: disable_streaming_preaggregations (bool) = false,
> 37: runtime_filter_mode (i32) = 2,
> 38: runtime_bloom_filter_size (i32) = 1048576,
> 39: runtime_filter_wait_time_ms (i32) = 0,
> 40: disable_row_runtime_filtering (bool) = false,
> 41: max_num_runtime_filters (i32) = 10,
> 42: parquet_annotate_strings_utf8 (bool) = false,
> 43: parquet_fallback_schema_resolution (i32) = 0,
> 45: s3_skip_insert_staging (bool) = true,
> 46: runtime_filter_min_size (i32) = 1048576,
> 47: runtime_filter_max_size (i32) = 16777216,
> 48: prefetch_mode (i32) = 1,
> 49: strict_mode (bool) = false,
> 50: scratch_limit (i64) = -1,
> 51: enable_expr_rewrites (bool) = true,
> 52: decimal_v2 (bool) = true,
> 53: parquet_dictionary_filtering (bool) = true,
> 54: parquet_array_resolution (i32) = 0,
> 55: parquet_read_statistics (bool) = true,
> 56: default_join_distribution_mode (i32) = 0,
> 57: disable_codegen_rows_threshold (i32) = 50000,
> 58: default_spillable_buffer_size (i64) = 2097152,
> 59: min_spillable_buffer_size (i64) = 65536,
> 60: max_row_size (i64) = 524288,
> 61: idle_session_timeout (i32) = 0,
> 62: compute_stats_min_sample_size (i64) = 1073741824,
> 63: exec_time_limit_s (i32) = 0,
> 64: shuffle_distinct_exprs (bool) = true,
> 65: max_mem_estimate_for_admission (i64) = 0,
> 66: thread_reservation_limit (i32) = 3000,
> 67: thread_reservation_aggregate_limit (i32) = 0,
> 68: kudu_read_mode (i32) = 0,
> 69: allow_erasure_coded_files (bool) = false,
> 70: timezone (string) = "",
> 71: scan_bytes_limit (i64) = 0,
> 72: cpu_limit_s (i64) = 0,
> 73: topn_bytes_limit (i64) = 536870912,
> 74: client_identifier (string) = "Impala Shell v4.0.0-SNAPSHOT (1e30eec)
> built on Fri Oct 16 13:26:18 PDT 2020",
> 75: resource_trace_ratio (double) = 0,
> 76: num_remote_executor_candidates (i32) = 3,
> 77: num_rows_produced_limit (i64) = 0,
> 78: planner_testcase_mode (bool) = false,
> 79: default_file_format (i32) = 0,
> 80: parquet_timestamp_type (i32) = 0,
> 81: parquet_read_page_index (bool) = true,
> 82: parquet_write_page_index (bool) = true,
> 84: disable_hdfs_num_rows_estimate (bool) = false,
> 86: spool_query_results (bool) = false,
> 87: default_transactional_type (i32) = 0,
> 88: statement_expression_limit (i32) = 250000,
> 89: max_statement_length_bytes (i32) = 16777216,
> 90: disable_data_cache (bool) = false,
> 91: max_result_spooling_mem (i64) = 104857600,
> 92: max_spilled_result_spooling_mem (i64) = 1073741824,
> 93: disable_hbase_num_rows_estimate (bool) = false,
> 94: fetch_rows_timeout_ms (i64) = 10000,
> 95: now_string (string) = "",
> 96: parquet_object_store_split_size (i64) = 268435456,
> 97: mem_limit_executors (i64) = 0,
> 98: broadcast_bytes_limit (i64) = 34359738368,
> 99: preagg_bytes_limit (i64) = -1,
> 100: enable_cnf_rewrites (bool) = true,
> 101: max_cnf_exprs (i32) = 0,
> 102: kudu_snapshot_read_timestamp_micros (i64) = 0,
> 103: retry_failed_queries (bool) = false,
> 104: enabled_runtime_filter_types (i32) = 3,
> 105: async_codegen (bool) = false,
> 106: enable_distinct_semi_join_optimization (bool) = true,
> 107: sort_run_bytes_limit (i64) = -1,
> 108: max_fs_writers (i32) = 0,
> 109: refresh_updated_hms_partitions (bool) = false,
> 110: spool_all_results_for_retries (bool) = true,
> 112: use_local_tz_for_unix_timestamp_conversions (bool) = false,
> 113: convert_legacy_hive_parquet_utc_timestamps (bool) = false,
> 114: enable_outer_join_to_inner_transformation (bool) = false,
> } {code}
> Stacktrace:
> {code:java}
> Thread 392 (crashed)
> 0 libc-2.17.so + 0x351f7
> 1 impalad!google::LogMessage::Flush() + 0x1eb
> 2 impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> 3 impalad!impala::QueryState::MonitorFInstances() [query-state.cc : 877 +
> 0x45]
> 4 impalad!impala::QueryExecMgr::ExecuteQueryHelper(impala::QueryState*)
> [query-exec-mgr.cc : 162 + 0x8]
> 5 impalad!boost::_bi::bind_t<void, boost::_mfi::mf1<void,
> impala::QueryExecMgr, impala::QueryState*>,
> boost::_bi::list2<boost::_bi::value<impala::QueryExecMgr*>,
> boost::_bi::value<impala::QueryState*> > >::operator()() [bind.hpp : 1222 +
> 0xe]
> 6 impalad!boost::function0<void>::operator()() const [function_template.hpp
> : 770 + 0x5]
> 7 impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*) [thread.cc : 360 + 0x9]
> 8 impalad!void
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*>
> >::operator()<void (*)(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*),
> boost::_bi::list0>(boost::_bi::type<void>, void
> (*&)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*), boost::_bi::list0&, int) [bind.hpp : 531 + 0x12]
> 9 impalad!boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> >
> >::operator()() [bind.hpp : 1222 + 0xe]
> 10 impalad!thread_proxy + 0x72
> 11 libpthread-2.17.so + 0x7e25
> 12 libc-2.17.so + 0xf834d {code}
> This looks like IMPALA-10050 but not sure if it's the same cause.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]