Re: Re:Re: The wrong Options of Kafka Connector, will make the cluster can not run any job

2021-04-27 Thread cxydevelop
Oh, I was wrong again: the last one was on flink_1.12.2, not flink_1.11.2.





Re:Re: The wrong Options of Kafka Connector, will make the cluster can not run any job

2021-04-27 Thread chenxuying
I tested the Flink job on both flink_1.11.2 and flink_1.12.2. The error log I
posted before is from the flink_1.11.2 cluster.

Now I am running the job on flink_1.11.2.




1. The wrong Options of Kafka Connector

The IP is right, but the port is wrong:

```
CREATE TABLE KafkaTable (
  message STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'filebeat_json_install_log',
  'properties.bootstrap.servers' = '192.168.0.77:9093',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);
```
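
Since the problem here is a `properties.bootstrap.servers` entry with a wrong port, a quick connectivity check before submitting the job might catch it early. The following is only a rough sketch, not part of Flink or the Kafka client: the helper names `parse_bootstrap_servers`, `is_reachable`, and `unreachable_brokers` are made up for illustration, and a successful TCP connection only shows that something is listening on that port, not that it is a Kafka broker.

```python
import socket
from typing import List, Tuple


def parse_bootstrap_servers(value: str) -> List[Tuple[str, int]]:
    """Split a 'host:port,host:port' string into (host, port) pairs.

    Hypothetical helper; only plain host:port entries are handled.
    """
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


def is_reachable(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def unreachable_brokers(value: str, timeout_s: float = 3.0) -> List[str]:
    """Return the entries of a bootstrap.servers string that cannot be reached."""
    return [
        f"{host}:{port}"
        for host, port in parse_bootstrap_servers(value)
        if not is_reachable(host, port, timeout_s)
    ]
```

Calling `unreachable_brokers("192.168.0.77:9093")` from a machine on the same network as the task managers returns the entries that refused the connection or timed out. An empty list means every listed address accepted a TCP connection.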




2. Job details in the Flink web UI

The Root Exception tab shows the following log:

```
2021-04-27 15:59:11
org.apache.kafka.common.errors.TimeoutException: Timeout of 6ms expired before the position for partition filebeat_json_install_log-3 could be determined
```




3. Logs in the Job Manager

The Job Manager prints the following logs continuously:

```
org.apache.kafka.common.errors.TimeoutException: Timeout of 6ms expired before the position for partition filebeat_json_install_log-3 could be determined
2021-04-27 08:03:16,162 INFO  org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - Calculating tasks to restart to recover the failed task cbc357ccb763df2852fee8c4fc7d55f2_0.
2021-04-27 08:03:16,163 INFO  org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - 1 tasks should be restarted to recover the failed task cbc357ccb763df2852fee8c4fc7d55f2_0.
2021-04-27 08:03:16,163 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job v2_ods_device_action_log (876dbcddcec696d42ed887512dacdf22) switched from state RUNNING to RESTARTING.
2021-04-27 08:03:17,163 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job v2_ods_device_action_log (876dbcddcec696d42ed887512dacdf22) switched from state RESTARTING to RUNNING.
2021-04-27 08:03:17,164 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Restoring job 876dbcddcec696d42ed887512dacdf22 from Checkpoint 6 @ 1619510548493 for 876dbcddcec696d42ed887512dacdf22 located at oss://tanwan-datahub/test/flinksql/checkpoints/876dbcddcec696d42ed887512dacdf22/chk-6.
2021-04-27 08:03:17,165 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - No master state to restore
2021-04-27 08:03:17,165 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Source: time, pk_id, key_id, idfv, media_site_id]) (1/1) (278dd023107c2fd3f2b42383e0c01794) switched from CREATED to SCHEDULED.
2021-04-27 08:03:17,165 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Source: ...) (1/1) (278dd023107c2fd3f2b42383e0c01794) switched from SCHEDULED to DEPLOYING.
2021-04-27 08:03:17,166 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Deploying Source: TableSourceScan(tac2fd3f2b42383e0c01794 to 192.168.3.64:6122-55a668 @ 192.168.3.64 (dataPort=34077) with allocation id 091b8c459bd00a2deaea398a41c831ab
2021-04-27 08:03:17,176 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Source: TableSourceScan(table=[[d...3e0c01794) switched from DEPLOYING to RUNNING.
```
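
The repeated RUNNING to RESTARTING transition above is the visible symptom of the restart loop. To check how often a job is restarting, one could count those transitions in a saved Job Manager log. This is a small sketch that assumes the log line format shown in this thread (other Flink versions may word the message differently); `count_restarts` and `RESTART_RE` are illustrative names, not Flink APIs.

```python
import re
from collections import Counter

# Matches "Job <name> (<32-hex job id>) switched from state RUNNING to RESTARTING",
# following the ExecutionGraph messages quoted in this thread.
RESTART_RE = re.compile(
    r"Job\s+(?P<name>\S+)\s+\((?P<job_id>[0-9a-f]{32})\)\s+"
    r"switched\s+from\s+state\s+RUNNING\s+to\s+RESTARTING"
)


def count_restarts(log_text: str) -> Counter:
    """Count RUNNING -> RESTARTING transitions per job id.

    Whitespace is collapsed first so that entries wrapped across
    several lines (as in a mail archive) still match.
    """
    flat = " ".join(log_text.split())
    return Counter(m.group("job_id") for m in RESTART_RE.finditer(flat))
```

Reading the Job Manager log file into a string and passing it to `count_restarts` gives a per-job count. A count that keeps growing while the job alternates between RUNNING and RESTARTING points to a source that fails right after each deployment, as it does here.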




4. Cancel job

When I cancel the job, the Job Manager prints the following logs:

```

2021-04-27 08:11:18,190 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 876dbcddcec696d42ed887512dacdf22 reached globally terminal state CANCELED.
2021-04-27 08:11:18,196 INFO  org.apache.flink.runtime.jobmaster.JobMaster     [] - Stopping the JobMaster for job v2_ods_device_action_log(876dbcddcec696d42ed887512dacdf22).
2021-04-27 08:11:18,197 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Suspending SlotPool.
2021-04-27 08:11:18,197 INFO  org.apache.flink.runtime.jobmaster.JobMaster     [] - Close ResourceManager connection 65303b0e98faaa00ada09ad7271be558: Stopping JobMaster for job v2_ods_device_action_log(876dbcddcec696d42ed887512dacdf22)..
2021-04-27 08:11:18,197 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Stopping SlotPool.
2021-04-27 08:11:18,197 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Disconnect job manager 0...@akka.tcp://flink@flink-jobmanager:6123/user/rpc/jobmanager_3 for job 876dbcddcec696d42ed887512dacdf22 from the resource manager.
2021-04-27 08:11:18,216 WARN  org.apache.flink.runtime.taskmanager.TaskManagerLocation [] - No hostname could be resolved for the IP address 192.168.3.64, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
2021-04-27 08:11:18,271 INFO  org.apache.flink.fs.osshadoop.shaded.com.aliyun.oss  [] - [Server]Unable to execute HTTP request: Not Found
[ErrorCode]: NoSuchKey
[RequestId]: 6087C726766D47343487BE32
[HostId]: null

2021-04-27 08:11:18,275 INFO  
org.apach