Re: some slots are not be available,when job is not running

2019-08-12 Thread Xintong Song
Hi,

It would be good if you can provide the job manager and task manager log
files, so that others can analysis the problem?

Thank you~

Xintong Song



On Mon, Aug 12, 2019 at 10:12 AM pengcheng...@bonc.com.cn <
pengcheng...@bonc.com.cn> wrote:

> Hi all,
> some slots are not be available,when job is not running.
> I get TM dump when job is not running,and analysis it with *Eclipse
> Memory Analyzer*. Here are some of the results which look useful:
>
>
>- org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread
>@ 0x7f9442c8
>
> Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map ->
> Timestamps/Watermarks -> from: (ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR,
> O_ORDERKEY_LONG, O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE,
> O_ORDERDATE, O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT,
> O_TIMESTAMP1, O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT,
> O_TIMESTAMP_EVENT, OVER_TIME) -> where: (AND(LIKE(O_ORDERPRIORITY,
> _UTF-16LE'%M%'), <=(O_CLERK, _UTF-16LE'Clerk#00144'), >=(O_CLERK,
> _UTF-16LE'Clerk#00048'), =(O_ORDERSTATUS, _UTF-16LE'O'))), select:
> (O_CLERK, O_ORDERDATE, CAST(_UTF-16LE'O') AS O_ORDERSTATUS,
> O_ORDERPRIORITY, OVER_TIME, EXTRACT(FLAG(MONTH), O_DATE) AS $f5, O_COMMENT)
> -> time attribute: (OVER_TIME) (2/8) 272 1,281,344 
> org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
> @ 0x7f944200 true
>
>- org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread
>@ 0x7f9442f5a630
>
> Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map ->
> from: (ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR, O_ORDERKEY_LONG,
> O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE,
> O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT, O_TIMESTAMP1,
> O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT, O_TIMESTAMP_EVENT,
> OVER_TIME) -> where: (AND(LIKE(O_ORDERPRIORITY, _UTF-16LE'3%'),
> =(O_ORDERSTATUS, _UTF-16LE'F'), <=(O_CLERK, _UTF-16LE'Clerk#06144'),
> >=(O_CLERK, _UTF-16LE'Clerk#00048'))), select: (O_CLERK, O_ORDERDATE,
> CAST(_UTF-16LE'F') AS O_ORDERSTATUS, OVER_TIME, EXTRACT(FLAG(MONTH),
> O_ORDERDATE) AS $f4, O_COMMENT, O_CUSTKEY) (2/8) 272 1,274,312 
> org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
> @ 0x7f9442f5a268 true
>
>- org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread
>@ 0x7f94f4f048a8
>
> Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map ->
> from: (ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR, O_ORDERKEY_LONG,
> O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE,
> O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT, O_TIMESTAMP1,
> O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT, O_TIMESTAMP_EVENT,
> OVER_TIME) -> where: (AND(IN(TRIM(FLAG(BOTH), _UTF-16LE' ',
> O_ORDERPRIORITY), _UTF-16LE'1-URGENT', _UTF-16LE'3-MEDIUM',
> _UTF-16LE'5-LOW'), >=(O_CLERK, _UTF-16LE'Clerk#24400'),
> >(EXTRACT(FLAG(MONTH), O_ORDERDATE), 6), <>(O_ORDERSTATUS, _UTF-16LE'O'),
> <=(O_ORDERKEY_INT, 12889))), select: (O_ORDERKEY_INT, O_CUSTKEY,
> O_ORDERSTATUS, O_ORDERDATE, O_CLERK, O_DATE, OVER_TIME) (2/8) 272
> 1,274,184 
> org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
> @ 0x7f94f4f04800 true
>
>- org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread
>@ 0x7f94441a1aa8
>
> Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map ->
> from: (ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR, O_ORDERKEY_LONG,
> O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE,
> O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT, O_TIMESTAMP1,
> O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT, O_TIMESTAMP_EVENT,
> OVER_TIME) -> where: (AND(LIKE(O_ORDERPRIORITY, _UTF-16LE'%M%'),
> <=(O_CLERK, _UTF-16LE'Clerk#00144'), >=(O_CLERK,
> _UTF-16LE'Clerk#00048'), =(O_ORDERSTATUS, _UTF-16LE'O'))), select:
> (O_CLERK, O_ORDERDATE, CAST(_UTF-16LE'O') AS O_ORDERSTATUS,
> O_ORDERPRIORITY, CEIL(MOD(O_CUSTKEY, EXTRACT(FLAG(MONTH), O_ORDERDATE))) AS
> $f4, OVER_TIME, EXTRACT(FLAG(MONTH), O_DATE) AS $f6, O_COMMENT) (2/8) 272
> 1,263,416 
> org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
> @ 0x7f94441a1a00 true
>
>- org.apache.kafka.common.utils.KafkaThread @ 0x7f91de001b78 »
>
> kafka-producer-network-thread | producer-1 184 342,912 
> org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
> @ 0x7f91de40 true
>
>- org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread
>@ 0x7f947a57d290
>
> Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map ->
> from: (ORD_ID, PS_PARTKEY, PS_SUPPKEY, PS_AVAILQTY, PS_SUPPLYCOST,
> PS_COMMENT, PS_INT, PS_LONG, PS_DOUBLE8, PS_DOUBLE14, PS_DOUBLE15,
> PS_NUMBER1, PS_NUMBER2, 

some slots are not be available,when job is not running

2019-08-12 Thread pengcheng...@bonc.com.cn
Hi all,
some slots are not be available,when job is not running.
I get TM dump when job is not running,and analysis it with Eclipse Memory 
Analyzer. Here are some of the results which look useful:

org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread @ 
0x7f9442c8
Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map -> 
Timestamps/Watermarks -> from: (ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR, 
O_ORDERKEY_LONG, O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, 
O_ORDERDATE, O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT, O_TIMESTAMP1, 
O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT, O_TIMESTAMP_EVENT, OVER_TIME) 
-> where: (AND(LIKE(O_ORDERPRIORITY, _UTF-16LE'%M%'), <=(O_CLERK, 
_UTF-16LE'Clerk#00144'), >=(O_CLERK, _UTF-16LE'Clerk#00048'), 
=(O_ORDERSTATUS, _UTF-16LE'O'))), select: (O_CLERK, O_ORDERDATE, 
CAST(_UTF-16LE'O') AS O_ORDERSTATUS, O_ORDERPRIORITY, OVER_TIME, 
EXTRACT(FLAG(MONTH), O_DATE) AS $f5, O_COMMENT) -> time attribute: (OVER_TIME) 
(2/8)2721,281,344org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
 @ 0x7f944200true
org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread @ 
0x7f9442f5a630
Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map -> from: 
(ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR, O_ORDERKEY_LONG, 
O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE, 
O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT, O_TIMESTAMP1, 
O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT, O_TIMESTAMP_EVENT, OVER_TIME) 
-> where: (AND(LIKE(O_ORDERPRIORITY, _UTF-16LE'3%'), =(O_ORDERSTATUS, 
_UTF-16LE'F'), <=(O_CLERK, _UTF-16LE'Clerk#06144'), >=(O_CLERK, 
_UTF-16LE'Clerk#00048'))), select: (O_CLERK, O_ORDERDATE, 
CAST(_UTF-16LE'F') AS O_ORDERSTATUS, OVER_TIME, EXTRACT(FLAG(MONTH), 
O_ORDERDATE) AS $f4, O_COMMENT, O_CUSTKEY) 
(2/8)2721,274,312org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
 @ 0x7f9442f5a268true
org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread @ 
0x7f94f4f048a8
Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map -> from: 
(ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR, O_ORDERKEY_LONG, 
O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE, 
O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT, O_TIMESTAMP1, 
O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT, O_TIMESTAMP_EVENT, OVER_TIME) 
-> where: (AND(IN(TRIM(FLAG(BOTH), _UTF-16LE' ', O_ORDERPRIORITY), 
_UTF-16LE'1-URGENT', _UTF-16LE'3-MEDIUM', _UTF-16LE'5-LOW'), >=(O_CLERK, 
_UTF-16LE'Clerk#24400'), >(EXTRACT(FLAG(MONTH), O_ORDERDATE), 6), 
<>(O_ORDERSTATUS, _UTF-16LE'O'), <=(O_ORDERKEY_INT, 12889))), select: 
(O_ORDERKEY_INT, O_CUSTKEY, O_ORDERSTATUS, O_ORDERDATE, O_CLERK, O_DATE, 
OVER_TIME) 
(2/8)2721,274,184org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
 @ 0x7f94f4f04800true
org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread @ 
0x7f94441a1aa8
Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map -> from: 
(ORD_ID, O_ORDERKEY_INT, O_ORDERKEY_VARCHAR, O_ORDERKEY_LONG, 
O_ORDERKEY_DOUBLE, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE, 
O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT, O_TIMESTAMP1, 
O_TIMESTAMP2, O_DATE, O_TIMESTAMP, O_DATE_EVENT, O_TIMESTAMP_EVENT, OVER_TIME) 
-> where: (AND(LIKE(O_ORDERPRIORITY, _UTF-16LE'%M%'), <=(O_CLERK, 
_UTF-16LE'Clerk#00144'), >=(O_CLERK, _UTF-16LE'Clerk#00048'), 
=(O_ORDERSTATUS, _UTF-16LE'O'))), select: (O_CLERK, O_ORDERDATE, 
CAST(_UTF-16LE'O') AS O_ORDERSTATUS, O_ORDERPRIORITY, CEIL(MOD(O_CUSTKEY, 
EXTRACT(FLAG(MONTH), O_ORDERDATE))) AS $f4, OVER_TIME, EXTRACT(FLAG(MONTH), 
O_DATE) AS $f6, O_COMMENT) 
(2/8)2721,263,416org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
 @ 0x7f94441a1a00true
org.apache.kafka.common.utils.KafkaThread @ 0x7f91de001b78 »
kafka-producer-network-thread | 
producer-1184342,912org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
 @ 0x7f91de40true
org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread @ 
0x7f947a57d290
Kafka 0.10 Fetcher for Source: Custom Source -> Map -> Filter -> Map -> from: 
(ORD_ID, PS_PARTKEY, PS_SUPPKEY, PS_AVAILQTY, PS_SUPPLYCOST, PS_COMMENT, 
PS_INT, PS_LONG, PS_DOUBLE8, PS_DOUBLE14, PS_DOUBLE15, PS_NUMBER1, PS_NUMBER2, 
PS_NUMBER3, PS_NUMBER4, PS_DATE, PS_TIMESTAMP, PS_DATE_EVENT, 
PS_TIMESTAMP_EVENT, OVER_TIME) -> select: (PS_PARTKEY, PS_SUPPKEY, PS_AVAILQTY, 
PS_COMMENT, OVER_TIME) 
(1/8)272243,512org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader
 @ 0x7f947a57c6d0true
akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread @ 
0x7f91e0da4c40 »

Re: some slots are not be available,when job is not running

2019-08-09 Thread Zhu Zhu
Hi pengchengling,

Does this issue happen before you submitting any job to the cluster or
after some jobs are terminated?
If it's the latter case, didi you wait for a while to see if the
unavailable slots became available again?

Thanks,
Zhu Zhu

pengcheng...@bonc.com.cn  于2019年8月9日周五 下午4:55写道:

> Hi,
>
> Why are some slots unavailable?
>
> My cluster model is standalone,and high-availability mode is zookeeper.
> task.cancellation.timeout: 0
> some slots are not be available,when job is not running.
>
>
>
> --
> pengcheng...@bonc.com.cn
>


Re: some slots are not be available,when job is not running

2019-08-09 Thread Zili Chen
Hi,

Could you attach the stack trace in exception or relevant logs?

Best,
tison.


pengcheng...@bonc.com.cn  于2019年8月9日周五 下午4:55写道:

> Hi,
>
> Why are some slots unavailable?
>
> My cluster model is standalone,and high-availability mode is zookeeper.
> task.cancellation.timeout: 0
> some slots are not be available,when job is not running.
>
>
>
> --
> pengcheng...@bonc.com.cn
>


some slots are not be available,when job is not running

2019-08-09 Thread pengcheng...@bonc.com.cn
Hi,

Why are some slots unavailable?

My cluster model is standalone,and high-availability mode is zookeeper. 
task.cancellation.timeout: 0
some slots are not be available,when job is not running.





pengcheng...@bonc.com.cn