Flink 1.12 SQL: setting the job name

2020-12-30 Posted by HideOnBushKi
Hi all,
Two questions, please:

tEnv.executeSql("sql1")
tEnv.executeSql("sql2")

1. With executeSql, the two statements seem to be submitted as separate jobs under different YARN application IDs. If both jobs depend on the same Kafka
table, does that affect the consumed positions? Or are the two application IDs actually one job?

2. How do I set the job name in Flink 1.12 SQL?



--
Sent from: http://apache-flink.147419.n8.nabble.com/
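
A minimal Java sketch related to both points above, under stated assumptions: the table names are placeholders, grouping several INSERTs in a StatementSet (so they are submitted as one job) is standard 1.12 API, while whether the 1.12 Table planner honors the pipeline.name option for the generated job name is an assumption I have not verified.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;

public class StatementSetSketch {
    public static void main(String[] args) {
        EnvironmentSettings settings =
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
        TableEnvironment tEnv = TableEnvironment.create(settings);

        // Each tEnv.executeSql("INSERT ...") call submits its own job; adding the
        // statements to a StatementSet submits them together as a single job instead.
        // Hypothetical sink/source names stand in for "sql1" / "sql2" above.
        StatementSet set = tEnv.createStatementSet();
        set.addInsertSql("INSERT INTO sink1 SELECT * FROM kafka_table");
        set.addInsertSql("INSERT INTO sink2 SELECT * FROM kafka_table");

        // Unverified for 1.12: pipeline.name is a standard Flink option; whether the
        // 1.12 Table planner applies it as the job name is an assumption here.
        tEnv.getConfig().getConfiguration().setString("pipeline.name", "my-sql-job");

        set.execute();
    }
}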

Flink 1.12 SQL: setting the job name

2020-12-30 Posted by HideOnBushKi
Hi all,
How do I set the job name in Flink 1.12 SQL?



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Question about setting parallelism in the SQL Client

2020-12-30 Posted by Jark Wu
In batch mode:

1. The Hive source infers its parallelism, and the inferred parallelism is determined by the number of files. You can disable the inference with
table.exec.hive.infer-source-parallelism=false, in which case the job parallelism is used,
or set an upper bound on the inferred parallelism with
table.exec.hive.infer-source-parallelism.max. [1] (A short sketch follows below.)
2. Same as above.
3. This should have nothing to do with max-parallelism. More likely you have not configured a max slot count: the source requests too much parallelism, YARN
cannot provide that many resources quickly, and the deployment times out.
   Setting slotmanager.number-of-slots.max prevents a batch job from requesting resources without bound.
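
A minimal, unverified sketch of where the table-level options above can be set programmatically; slotmanager.number-of-slots.max is a cluster-level option that normally goes into flink-conf.yaml, and the value 16 is just an example:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class HiveParallelismSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

        Configuration conf = tEnv.getConfig().getConfiguration();
        // Either disable file-count based parallelism inference for the Hive source,
        // falling back to the configured job parallelism ...
        conf.setString("table.exec.hive.infer-source-parallelism", "false");
        // ... or keep inference enabled but cap the inferred parallelism.
        conf.setString("table.exec.hive.infer-source-parallelism.max", "16");

        // Cluster side (flink-conf.yaml, not a table option):
        //   slotmanager.number-of-slots.max: 100
        // caps the total slots a batch job may request.
    }
}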

Best,
Jark

[1]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/hive/hive_read_write.html#source-parallelism-inference
[2]:
https://ci.apache.org/projects/flink/flink-docs-master/deployment/config.html#slotmanager-number-of-slots-max

On Thu, 31 Dec 2020 at 14:56, jiangjiguang719 wrote:

> On Flink 1.12, submitting jobs with the SQL Client to read a Hive table, I have some questions about parallelism. Observations:
> In flink-conf.yaml:
>   taskmanager.numberOfTaskSlots: 1   takes effect
>   parallelism.default: 1 has no effect; the actual job parallelism = the number of files in the Hive table, and <= 160
> In sql-client-defaults.yaml:
>   execution:
>     parallelism: 10 has no effect
>     max-parallelism: 16 — when the number of files in the Hive table exceeds this value, it fails with insufficient resources: Deployment took more
> than 60 seconds. Please check if the requested resources are available in
> the YARN cluster
> Questions:
> 1. How do I set the parallelism when submitting jobs with the SQL Client?
> 2. Why does the parallelism setting have no effect?
> 3. Why does deployment fail when the number of Hive table files exceeds max-parallelism?


Question about setting parallelism in the SQL Client

2020-12-30 Posted by jiangjiguang719
On Flink 1.12, submitting jobs with the SQL Client to read a Hive table, I have some questions about parallelism. Observations:
In flink-conf.yaml:
  taskmanager.numberOfTaskSlots: 1   takes effect
  parallelism.default: 1 has no effect; the actual job parallelism = the number of files in the Hive table, and <= 160
In sql-client-defaults.yaml:
  execution:
    parallelism: 10 has no effect
    max-parallelism: 16 — when the number of files in the Hive table exceeds this value, it fails with insufficient resources: Deployment took more than 60
seconds. Please check if the requested resources are available in the YARN
cluster
Questions:
1. How do I set the parallelism when submitting jobs with the SQL Client?
2. Why does the parallelism setting have no effect?
3. Why does deployment fail when the number of Hive table files exceeds max-parallelism?

Some questions about running INSERT INTO in Flink SQL

2020-12-30 Posted by Jacob
Dear All,

Flink SQL> insert into table1 select filed1,filed2,.filed10 from table2;

When I run a statement like the one above in Flink SQL, the web UI shows it as finished almost immediately, but no data is written into table1, and the logs show no errors. Confusing.


Also, when running select count(*) on a table, some tables return a result while others show nothing at all, with no error either. I really don't know what the cause is...



-
Thanks!
Jacob
--
Sent from: http://apache-flink.147419.n8.nabble.com/
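
A small, speculative Java sketch of one thing worth ruling out for the INSERT case above: executeSql submits the INSERT job asynchronously and returns immediately, so the client side can look finished before the job has actually completed. Waiting on the returned TableResult makes the outcome explicit. The table/field names are placeholders and this assumes TableResult#await is available in the 1.12 build in use; it is only a guess at the cause described above.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

public class InsertAwaitSketch {
    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // executeSql(INSERT ...) only submits the job and returns; it does not wait.
        TableResult result =
                tEnv.executeSql("INSERT INTO table1 SELECT field1, field2 FROM table2");

        // Block until the submitted job reaches a terminal state, surfacing any failure
        // that would otherwise only show up in the cluster logs.
        result.await();
    }
}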

Re: On pre-aggregating by process_time first, then aggregating by event_time

2020-12-30 Posted by 赵一旦
The analysis below should be basically sound; I'm posting it for others' reference.

赵一旦 wrote on Thu, Dec 31, 2020, 1:01 PM:

> Goal, as in the subject: pre-aggregate by process_time first, then do the final aggregation by event_time.
>
> The pre-aggregation uses a 10 s window; the final aggregation uses a 5 min window with a 10 s continuous trigger.
>
>
> To avoid data from two adjacent 5-minute windows being pre-aggregated together near a window boundary (an incorrect case), I added an extra field (time) to the pre-aggregation key, where time is the record's timestamp formatted to the end of its 5-minute window. So that particular problem can be ignored.
>
> But I have now found a bigger problem: the pv that the final window emits for a given key+time can become smaller. This was puzzling at first; after a lot of thought I worked out the following.
> A key+time1 that should be 1000 comes out as key+time1:
> 50. That 50 pv is actually made up of data from two windows, partly from the time1 window and partly from the time2 window, but the reduce function reuses value1, so the emitted time is time1.
> Why is the pv 50? Because the time2 part is itself very small, and the time1 part is only the pre-aggregated data of the last 10 s of that window (just 10 s worth).
>
> The stranger question is why these two parts were aggregated together and emitted at all; that is the crux. After analysis I finally recalled a detail I had overlooked and taken for granted: when a window emits, it stamps the output element with the window's maximum timestamp. I had assumed this applies only to event_time windows, but it also applies to process_time windows. When the last few seconds of the time1 window are actually processed, processing time has inevitably already moved into time2 (the stream is always somewhat delayed, so the wall-clock time at processing is greater than the data's event time), so the pre-aggregation output carries the maxTs of the processing-time window as its element timestamp. That is exactly what the source code does. But downstream I kept treating that timestamp as event_time, and that is what hurts.
>
>
>
> It also feels a bit odd that Flink applies this mechanism not only to event_time but to process_time as well; it makes it hard to use the two notions of time in the same pipeline...
>
>
> Of course, there is a workaround: reassign timestamps and watermarks after the pre-aggregation window.
>
>
>
> Another option is to make the pre-aggregation window event_time as well. But this job has very few keys and high accuracy requirements, and I set maxOutOfOrderness to one full day (so that after a data-flow incident the job can still process backfilled data correctly). If both windows use event_time, the window state roughly doubles; with processing time, the state of the pre-aggregation window is practically negligible.
>
>


On pre-aggregating by process_time first, then aggregating by event_time

2020-12-30 Posted by 赵一旦
Goal, as in the subject: pre-aggregate by process_time first, then do the final aggregation by event_time.

The pre-aggregation uses a 10 s window; the final aggregation uses a 5 min window with a 10 s continuous trigger.

To avoid data from two adjacent 5-minute windows being pre-aggregated together near a window boundary (an incorrect case), I added an extra field (time) to the pre-aggregation key, where time is the record's timestamp formatted to the end of its 5-minute window. So that particular problem can be ignored.

But I have now found a bigger problem: the pv that the final window emits for a given key+time can become smaller. This was puzzling at first; after a lot of thought I worked out the following.
A key+time1 that should be 1000 comes out as key+time1:
50. That 50 pv is actually made up of data from two windows, partly from the time1 window and partly from the time2 window, but the reduce function reuses value1, so the emitted time is time1.
Why is the pv 50? Because the time2 part is itself very small, and the time1 part is only the pre-aggregated data of the last 10 s of that window (just 10 s worth).
The stranger question is why these two parts were aggregated together and emitted at all; that is the crux. After analysis I finally recalled a detail I had overlooked and taken for granted: when a window emits, it stamps the output element with the window's maximum timestamp. I had assumed this applies only to event_time windows, but it also applies to process_time windows. When the last few seconds of the time1 window are actually processed, processing time has inevitably already moved into time2 (the stream is always somewhat delayed, so the wall-clock time at processing is greater than the data's event time), so the pre-aggregation output carries the maxTs of the processing-time window as its element timestamp. That is exactly what the source code does. But downstream I kept treating that timestamp as event_time, and that is what hurts.


It also feels a bit odd that Flink applies this mechanism not only to event_time but to process_time as well; it makes it hard to use the two notions of time in the same pipeline...


Of course, there is a workaround: reassign timestamps and watermarks after the pre-aggregation window (a rough sketch of this idea follows below).


Another option is to make the pre-aggregation window event_time as well. But this job has very few keys and high accuracy requirements, and I set maxOutOfOrderness to one full day (so that after a data-flow incident the job can still process backfilled data correctly). If both windows use event_time, the window state roughly doubles; with processing time, the state of the pre-aggregation window is practically negligible.
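
A minimal, untested DataStream sketch of that workaround: processing-time pre-aggregation, then reassigning timestamps and watermarks from the event time carried in the records before the event-time window. The record shape (key, eventTime, pv), the reduce logic, and the extra 5-minute "time" key field described above are simplified or omitted for illustration.

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger;

public class TwoStageAggSketch {

    // events: (key, eventTime, pv) - a made-up shape standing in for the real records
    public static DataStream<Tuple3<String, Long, Long>> build(
            DataStream<Tuple3<String, Long, Long>> events) {

        DataStream<Tuple3<String, Long, Long>> preAggregated = events
                // Stage 1: cheap pre-aggregation on processing time, 10 s windows.
                .keyBy(e -> e.f0, Types.STRING)
                .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .reduce((a, b) -> Tuple3.of(a.f0, Math.max(a.f1, b.f1), a.f2 + b.f2))
                // The processing-time window stamps its output with the window's maxTs,
                // which is what broke the downstream event-time logic; reassign the event
                // time carried inside the record before the second stage.
                .assignTimestampsAndWatermarks(
                        WatermarkStrategy
                                .<Tuple3<String, Long, Long>>forBoundedOutOfOrderness(
                                        Duration.ofDays(1))
                                .withTimestampAssigner((e, ts) -> e.f1));

        // Stage 2: final event-time aggregation, 5 min windows, fired every 10 s.
        return preAggregated
                .keyBy(e -> e.f0, Types.STRING)
                .window(TumblingEventTimeWindows.of(Time.minutes(5)))
                .trigger(ContinuousEventTimeTrigger.of(Time.seconds(10)))
                .reduce((a, b) -> Tuple3.of(a.f0, Math.max(a.f1, b.f1), a.f2 + b.f2));
    }
}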


Re: flink 1.12 error: OSError: Expected IPC message of type schema but got record batch

2020-12-30 Posted by 咿咿呀呀
up



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: flink-cdc cannot read the binlog, and the program reports no error

2020-12-30 Posted by chenjb
Sorry, I forgot to check the messages here later on. At the time I was debugging locally in IDEA without the web UI open, so I never looked at the TaskManager logs; in IDEA it indeed just ended without any error. After strictly matching the field types to the official documentation, the problem went away. Thanks for the reply.



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Flink CDC connector: error during the snapshot phase when the data volume is large

2020-12-30 Posted by chenjb
I've hit the same problem. Did you find the cause?



--
Sent from: http://apache-flink.147419.n8.nabble.com/


flink-cdc: after a simple aggregation, sinking back to MySQL via the jdbc-connector, the SnapshotReader reports an error

2020-12-30 Posted by chenjb


Hi all, please help take a look, thanks.


Scenario:
The source reads data from MySQL via CDC (roughly 4+ million rows), a simple GROUP BY written in SQL,
and the result is written to MySQL through the jdbc-connector.

Checkpointing is configured (every 5 min) and the job is submitted to a Flink
standalone cluster. Shortly after it starts, the data flow seems to stop after only a few tens of thousands of rows have been processed, yet the job stays in RUNNING state.

While running, the sink is very slow (roughly 17 records per second), causing heavy backpressure on the source.

Looking at the taskmanager log, I found the following error:

2020-12-30 14:31:38,723 INFO  io.debezium.connector.mysql.SnapshotReader
  
[] - Step 8: committing transaction
2020-12-30 14:31:38,723 ERROR io.debezium.connector.mysql.SnapshotReader
  
[] - Failed due to error: Aborting snapshot due to error when last running
'SELECT * FROM `ydy_hsapdb`.`alfk_ydy_qyw_user_detail`': Streaming result
set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@2c66154c is still
active. No statements may be issued when any streaming result sets are open
and in use on a given connection. Ensure that you have called .close() on
any active streaming result sets before attempting more queries.
org.apache.kafka.connect.errors.ConnectException: Streaming result set
com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@2c66154c is still
active. No statements may be issued when any streaming result sets are open
and in use on a given connection. Ensure that you have called .close() on
any active streaming result sets before attempting more queries. Error code:
0; SQLSTATE: S1000.
at 
io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:?]
at
io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:207)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:?]
at
io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:831)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:?]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_201]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: java.sql.SQLException: Streaming result set
com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@2c66154c is still
active. No statements may be issued when any streaming result sets are open
and in use on a given connection. Ensure that you have called .close() on
any active streaming result sets before attempting more queries.
at
com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at
com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at
com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at com.mysql.cj.jdbc.ConnectionImpl.commit(ConnectionImpl.java:813)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at
io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:747)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:?]
... 3 more
2020-12-30 14:31:38,729 WARN  io.debezium.connector.mysql.SnapshotReader
  
[] - Failed to close the connection properly
java.sql.SQLException: Streaming result set
com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@2c66154c is still
active. No statements may be issued when any streaming result sets are open
and in use on a given connection. Ensure that you have called .close() on
any active streaming result sets before attempting more queries.
at
com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at
com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at
com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at
com.mysql.cj.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:1961)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at com.mysql.cj.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:1855)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at com.mysql.cj.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:1720)
~[blob_p-84a0db5932f0f8db241fae3c0801c7e31718c243-c15224c86e25f0dfba1c2b8da96760fb:1.1.0]
at com.mysql.cj.jdbc.ConnectionImpl.close(ConnectionImpl.java:720)

Re: How to use OSS as checkpoint/savepoint/state backend in Flink?

2020-12-30 Posted by Yun Tang
Hi

The community documentation [1] already gives detailed steps:

  1.  Put the flink-oss-fs-hadoop jar into the plugins directory
  2.  Configure the OSS endpoint, access key id and secret
  3.  Wherever OSS is needed, use a URI with the oss:// scheme, e.g. when creating the state backend

[1] 
https://ci.apache.org/projects/flink/flink-docs-stable/deployment/filesystems/oss.html
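
A short, unverified Java sketch of step 3 with a made-up bucket/path; steps 1 and 2 (the plugin jar plus the fs.oss.endpoint / fs.oss.accessKeyId / fs.oss.accessKeySecret entries in flink-conf.yaml) still have to be done on the cluster side as described above:

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class OssStateBackendSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 5 minutes; the oss:// scheme is resolved by the
        // flink-oss-fs-hadoop plugin configured in steps 1 and 2.
        env.enableCheckpointing(5 * 60 * 1000);
        env.setStateBackend(new FsStateBackend("oss://my-bucket/flink/checkpoints"));

        // ... build and execute the job as usual ...
    }
}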

Best,
Yun Tang

From: 陈帅 
Sent: Wednesday, December 30, 2020 20:53
To: user-zh@flink.apache.org 
Subject: How to use OSS as checkpoint/savepoint/state backend in Flink?

How can Flink use OSS as the checkpoint/savepoint/state backend? Does it require depending on Hadoop and configuring Hadoop on OSS?


Re: Re: Problem deploying a flink 1.12.0 kubernetes-session

2020-12-30 Posted by 陈帅
I set up MiniKube on a MacBook Pro, with VirtualBox installed. What are the correct steps to start Flink v1.11.3 on K8S?
The steps I tried were:


minikube start
cd /Users/admin/dev/flink-1.11.3
./bin/kubernetes-session.sh
At this point the image being pulled is shown as flink:1.11.3-scala_2.12, not the flink:1.11.3-scala_2.12-java8 that the official Flink images on Docker Hub provide.
So I retried with:
./bin/kubernetes-session.sh \
  -Dkubernetes.cluster-id=my-flink-cluster \
  -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8


After waiting a while for the image to be pulled, `kubectl get pods` shows:



SJ-DN0393:flink-1.11.3 admin$ kubectl get pods 

NAME   READY   STATUS 
RESTARTS   AGE

kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1 Running3 
 10d

my-flink-cluster-77c6f85879-9vcx8  0/1 CrashLoopBackOff   5 
 29m




And `kubectl describe pod` shows:




Events:

  Type Reason   AgeFrom   Message

   --         ---

  Normal   Scheduled29mdefault-scheduler  Successfully 
assigned default/my-flink-cluster-77c6f85879-9vcx8 to minikube

  Warning  FailedMount  29mkubelet
MountVolume.SetUp failed for volume "flink-config-volume" : configmap 
"flink-config-my-flink-cluster" not found

  Warning  FailedMount  29mkubelet
MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap 
"hadoop-config-my-flink-cluster" not found

  Normal   Pulling  29mkubeletPulling image 
"flink:1.11.3-scala_2.12-java8"

  Normal   Pulled   2m41s (x5 over 4m34s)  kubeletContainer 
image "flink:1.11.3-scala_2.12-java8" already present on machine

  Normal   Created  2m41s (x5 over 4m33s)  kubeletCreated 
container flink-job-manager

  Normal   Started  2m41s (x5 over 4m33s)  kubeletStarted 
container flink-job-manager

  Warning  BackOff  2m8s (x10 over 4m18s)  kubeletBack-off 
restarting failed container




















On 2020-12-28 10:40:59, "Yang Wang" wrote:
>There are two problems in your overall flow:
>
>1. Image not found
>This is probably related to the minikube driver setting. If it is hyperkit or another VM driver, you need to
>`minikube ssh` into the VM and check whether the image actually exists there.
>
>2. JM link not reachable
>2020-12-27 22:08:12,387 INFO
>org.apache.flink.kubernetes.KubernetesClusterDescriptor  [] - Create
>flink session cluster session001 successfully, JobManager Web Interface:
>http://192.168.99.100:8081
>
>I suspect that log line was not produced by the command you pasted, because your command uses the NodePort type, and the printed JM address should not be on port 8081.
>As long as the job you submit on minikube has kubernetes.rest-service.exposed.type=NodePort and the JM comes up, the printed JM address will be reachable.
>
>You can also assemble the link manually: get the APIServer address with minikube ip, then use kubectl get svc to look up the NodePort of the rest svc of the Flink
>Session Cluster you created, and put the two together.
>
>
>Best,
>Yang
>
>陈帅 wrote on Sun, Dec 27, 2020, 10:51 PM:
>
>>
>> This is my first attempt at deploying Flink on k8s. The version is 1.12.0, JDK is 1.8.0_275, Scala is 2.12.12, and my Mac has a single-node minikube environment installed. The experiment steps were:
>>
>>
>> git clone https://github.com/apache/flink-docker
>> cd flink-docker/1.12/scala_2.12-java8-debian
>> docker build --tag flink:1.12.0-scala_2.12-java8 .
>>
>>
>> cd flink-1.12.0
>> ./bin/kubernetes-session.sh \
>> -Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
>> -Dkubernetes.rest-service.exposed.type=NodePort \
>> -Dtaskmanager.numberOfTaskSlots=2 \
>> -Dkubernetes.cluster-id=flink-session-cluster
>>
>>
>> The JM appears to have started, but it cannot be accessed via the web UI:
>>
>> 2020-12-27 22:08:12,387 INFO
>> org.apache.flink.kubernetes.KubernetesClusterDescriptor  [] - Create
>> flink session cluster session001 successfully, JobManager Web Interface:
>> http://192.168.99.100:8081
>>
>>
>>
>>
>> `kubectl get pods` shows the pod stuck in the ContainerCreating state:
>>
>> NAME   READY   STATUS
>> RESTARTS   AGE
>>
>> flink-session-cluster-858bd55dff-bzjk2 0/1
>>  ContainerCreating   0  5m59s
>>
>> kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1 Running
>>  0  6d14h
>>
>>
>>
>>
>> So I ran `kubectl describe pod
>> flink-session-cluster-858bd55dff-bzjk2` for details; the result is as follows:
>>
>>
>>
>>
>> Name: flink-session-cluster-858bd55dff-bzjk2
>>
>> Namespace:default
>>
>> Priority: 0
>>
>> Node: minikube/192.168.99.100
>>
>> Start Time:   Sun, 27 Dec 2020 22:21:56 +0800
>>
>> Labels:   app=flink-session-cluster
>>
>>   component=jobmanager
>>
>>   pod-template-hash=858bd55dff
>>
>>   type=flink-native-kubernetes
>>
>> Annotations:  
>>
>> Status:   Pending
>>
>> IP:   172.17.0.4
>>
>> IPs:
>>
>>   IP:   172.17.0.4
>>
>> Controlled By:  ReplicaSet/flink-session-cluster-858bd55dff
>>
>> Containers:
>>
>>   flink-job-manager:
>>
>> Container ID:
>>
>> Image: flink:1.12.0-scala_2.12-java8
>>
>> Image ID:
>>
>> Ports: 8081/TCP, 6123/TCP, 6124/TCP
>>
>> Host Ports:0/TCP, 0/TCP, 0/TCP
>>
>> Command:
>>
>>   /docker-entrypoint.sh
>>
>> Args:
>>
>>   native-k8s
>>
>>   $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824
>> -Xms1073741824 

Re: Does Flink SQL support setting a window trigger to implement a continuous trigger?

2020-12-30 Posted by Sebastian Liu
table.exec.emit.early-fire.delay is a duration-type configuration; applying it fails if the value has no unit.
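
In other words, an untested sketch based on the snippet quoted below, with "6 s" as an arbitrary example value:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class EarlyFireSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        Configuration tableConfig = tEnv.getConfig().getConfiguration();
        tableConfig.setString("table.exec.emit.early-fire.enabled", "true");
        // The delay is a Duration, so it must carry a time unit; a bare "6" fails to parse.
        tableConfig.setString("table.exec.emit.early-fire.delay", "6 s");
    }
}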

fan_future wrote on Wed, Dec 30, 2020, 4:00 PM:

> These two parameters:
> table.exec.emit.early-fire.enabled
> table.exec.emit.early-fire.delay
> how should they be set??
>
> EnvironmentSettings build =
>
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
> TableEnvironment tEnv = TableEnvironment.create(build);
>
> Configuration tableConfig = tEnv.getConfig().getConfiguration();
> tableConfig.setString("table.exec.emit.early-fire.enabled","true");
> tableConfig.setString("table.exec.emit.early-fire.delay","6");
> When I do it this way it throws an error
>
>
>
> --
> Sent from: http://apache-flink.147419.n8.nabble.com/
>


-- 

*With kind regards

Sebastian Liu 刘洋
Institute of Computing Technology, Chinese Academy of Science
Mobile\WeChat: +86—15201613655
E-mail: liuyang0...@gmail.com 
QQ: 3239559*


flink on yarn??????????????

2020-12-30 Posted by ??????
??
bin/flink run -m yarn-cluster -yjm 1024 -ytm 1024 -c 
com.dwd.lunch.dwd_lunch.Dim_Cc_Media -ys 1 xjia_shuyun-6.0.jar

nodemanager??


Deployment took more than 60 seconds. Please check if the requested resources 
are available in the YARN cluster
??
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: 
The YARN application unexpectedly switched to state KILLED during 
deployment.
Diagnostics from YARN: Application application_1607073161732_0137 was killed by 
user xjia at 192.168.100.31

Re: flink 1.12.0 + hive 3.1.2 error: java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument

2020-12-30 Posted by Sebastian Liu
Hi Jianqiang,

The screenshots in the email are not visible. From your description, though, this looks like a shaded-jar problem. When the Flink SQL
client starts, it automatically uses constructFlinkClassPath from FLINK_HOME/bin/config.sh
and appends it to the CC_CLASSPATH of the SQL client JVM. So in principle you do not need to copy extra jars into the Flink
lib directory. If there are special dependency jars, they still need to be on the CLASSPATH,
but passing them with "--jar" or "--library" is usually the better choice; those options treat the jars as user
jars of the job and upload them to the JM together with the JobGraph.

The flink-dist fat jar should already contain guava 18.

Zeng, Jianqiang Zack wrote on Wed, Dec 30, 2020, 5:04 PM:

>
>
>
>
>
>
>
>
>
>
>
>
> Best Regards!
>
> Have a good day!
>
>
>
>
> *Zack Zeng *Associate Manager, Business Analyst
> Boston Scientific
> China Information Services
> jianqiang.z...@bsci.com
> (+86)21-61417831
> #763 Mengzi Road, Shanghai, China
> www.bostonscientific.com
>
> [image: bsc]
>
>
>
>
>
> *From:* Zeng, Jianqiang Zack
> *Sent:* Wednesday, December 30, 2020 4:42 PM
> *To:* user-zh@flink.apache.org
> *Subject:* flink 1.12.0 + hive 3.1.2 error: java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument
>
>
>
> I installed the official Flink 1.12.0 build and it starts normally: JPS shows the related processes and the web UI is up. I configured the connection to Hive 3.1.2 and put the related JARs
> into Flink's lib folder, but starting sql-client fails. Searching suggests a guava problem, yet my guava jar is a symlink directly to the guava jar
> under hive, and hadoop shares the same jar as well. Is something else misconfigured? Screenshots below, thanks!
>
>
>
> JPS screenshot
>
> WebUI screenshot
>
> Flink/Lib screenshot
>
> Sql-client screenshot
>
> *Sql-client startup error screenshot*
>
> Hive starts normally screenshot
>
>
>
>
>
> Best Regards!
>
> Have a good day!
>
>
>
>
> *Zack Zeng *Associate Manager, Business Analyst
> Boston Scientific
> China Information Services
> jianqiang.z...@bsci.com
> (+86)21-61417831
> #763 Mengzi Road, Shanghai, China
> www.bostonscientific.com
>
> [image: bsc]
>
>
>
>
>


-- 

*With kind regards

Sebastian Liu 刘洋
Institute of Computing Technology, Chinese Academy of Science
Mobile\WeChat: +86—15201613655
E-mail: liuyang0...@gmail.com 
QQ: 3239559*


Re: Retract stream and window aggregation

2020-12-30 Posted by 孙啸龙
Thanks a lot for the reply.
Question 1: Real-time ETL involves a lot of joins. Once you join, can aggregations only be done with non-windowed aggregation? Doesn't that mean windows and interval
joins are basically unusable in real-time ETL?
Question 2:
With connector = 'upsert-kafka', what is read is a retract stream.
If windows and interval joins cannot be used downstream, does that mean watermarks have no use in this situation?

> On Dec 30, 2020, at 8:31 PM, hailongwang <18868816...@163.com> wrote:
>
> Is the window size you need large? Could you work around it with mini-batch, non-windowed aggregation?
>
>
> Best,
> Hailong
> On 2020-12-30 17:41:50, "孙啸龙" wrote:
>> Hi everyone:
>>
>> Version: 1.12.0
>> API: Flink SQL
>> Question: after a two-stream join the result is a retract stream, which cannot be used in a window aggregation. How is this scenario handled?



How to use OSS as checkpoint/savepoint/state backend in Flink?

2020-12-30 Posted by 陈帅
How can Flink use OSS as the checkpoint/savepoint/state backend? Does it require depending on Hadoop and configuring Hadoop on OSS?


Problem reading/writing Aliyun OSS from Flink

2020-12-30 Posted by 陈帅
How can Flink read/write Aliyun OSS in batch and streaming mode? I tried connecting to OSS through the filesystem SQL connector [1] and, as the docs describe, configured the OSS
endpoint and accessKey (id + secret) [2] in flink-conf.yaml, but the program still reports that
fs.oss.endpoint cannot be found, even though I have already put the flink-oss-fs-hadoop jar under the
plugins/ directory. Does reading/writing OSS necessarily depend on Hadoop? The docs mention the Hadoop Aliyun
module [3], but I'm not sure what the concrete steps are. A concrete hands-on example would be appreciated, thanks!


[1] 
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/filesystem.html
[2] 
https://ci.apache.org/projects/flink/flink-docs-master/deployment/filesystems/oss.html
[3] 
http://hadoop.apache.org/docs/current/hadoop-aliyun/tools/hadoop-aliyun/index.html





Re: Retract stream and window aggregation

2020-12-30 Posted by hailongwang
Is the window size you need large? Could you work around it with mini-batch, non-windowed aggregation? (A configuration sketch follows below.)
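
For reference, a hedged Java sketch of the mini-batch settings that a non-windowed (group) aggregation would typically use; the values are arbitrary examples:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MiniBatchAggSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        Configuration conf = tEnv.getConfig().getConfiguration();
        // Buffer input records and fire the non-windowed group aggregation in small
        // batches instead of per record; this also works on the retract stream
        // produced by a two-stream join.
        conf.setString("table.exec.mini-batch.enabled", "true");
        conf.setString("table.exec.mini-batch.allow-latency", "5 s");
        conf.setString("table.exec.mini-batch.size", "5000");
    }
}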


Best,
Hailong
On 2020-12-30 17:41:50, "孙啸龙" wrote:
>Hi everyone:
>
>Version: 1.12.0
>API: Flink SQL
>Question: after a two-stream join the result is a retract stream, which cannot be used in a window aggregation. How is this scenario handled?


Re: Re: flink 1.12 SQL Client error

2020-12-30 Posted by jiangjiguang719
The problem is solved:
1. flink-sql-connector-hive-2.3.6_2.11-1.12.0.jar is needed,
and flink-connector-hive_2.11-1.12.0.jar and hive-exec-2.3.4.jar must be removed.
2. Besides restarting the SQL Client, the local cluster also has to be restarted.

















On 2020-12-30 18:32:29, "hailongwang" <18868816...@163.com> wrote:
>Did you put the jar in only after starting it? Try restarting the SQL Client.
>
>
>
>
>On 2020-12-30 15:26:59, "jiangjiguang719" wrote:
>>When running a Hive query with the SQL Client I get an error:
>>Even though flink-connector-hive_2.11-1.12.0.jar is there, it still reports java.lang.ClassNotFoundException:
>>org.apache.flink.connectors.hive.HiveSource
>>Please take a look.
>>
>>
>>Error message:
>>
>>Flink SQL> select count(*) from zxw_test_1225_01;
>>2020-12-30 16:20:42,518 WARN  org.apache.hadoop.hive.conf.HiveConf
>> [] - HiveConf of name hive.spark.client.submit.timeout.interval 
>>does not exist
>>2020-12-30 16:20:42,519 WARN  org.apache.hadoop.hive.conf.HiveConf
>> [] - HiveConf of name hive.support.sql11.reserved.keywords does 
>>not exist
>>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf
>> [] - HiveConf of name 
>>hive.spark.client.rpc.server.address.use.ip does not exist
>>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf
>> [] - HiveConf of name hive.enforce.bucketing does not exist
>>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf
>> [] - HiveConf of name hive.server2.enable.impersonation does not 
>>exist
>>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf
>> [] - HiveConf of name hive.run.timeout.seconds does not exist
>>2020-12-30 16:20:43,065 WARN  
>>org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory  [] - The 
>>short-circuit local reads feature cannot be used because libhadoop cannot be 
>>loaded.
>>2020-12-30 16:20:43,245 INFO  org.apache.hadoop.mapred.FileInputFormat
>> [] - Total input files to process : 24
>>[ERROR] Could not execute SQL statement. Reason:
>>java.lang.ClassNotFoundException: org.apache.flink.connectors.hive.HiveSource
>>
>>
>>lib contents:
>># tree lib
>>lib
>>├── flink-connector-hive_2.11-1.12.0.jar
>>├── flink-csv-1.12.0.jar
>>├── flink-dist_2.11-1.12.0.jar
>>├── flink-hadoop-compatibility_2.11-1.12.0.jar
>>├── flink-json-1.12.0.jar
>>├── flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
>>├── flink-shaded-zookeeper-3.4.14.jar
>>├── flink-table_2.11-1.12.0.jar
>>├── flink-table-blink_2.11-1.12.0.jar
>>├── hive-exec-2.3.4.jar
>>├── log4j-1.2-api-2.12.1.jar
>>├── log4j-api-2.12.1.jar
>>├── log4j-core-2.12.1.jar
>>└── log4j-slf4j-impl-2.12.1.jar


Reply: Custom connector error

2020-12-30 Posted by superainbower
One more follow-up: in the Cloudera/kudu pom file on git I found

<dependency>
  <groupId>com.stumbleupon</groupId>
  <artifactId>async</artifactId>
  <version>${async.version}</version>
  <exclusions>
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
    </exclusion>
  </exclusions>
</dependency>

"Not shaded or included in the client JAR because it's part of the public
API" — so the Flink cluster simply does not contain this public API, which should be what causes the error.


| |
superainbower
|
|
superainbo...@163.com
|
签名由网易邮箱大师定制


On Dec 30, 2020, 19:16, superainbower wrote:
To add: in the class org.apache.kudu.client.AsyncKuduClient under Cloudera/kudu on git, I can indeed see
import com.stumbleupon.async.Callback,
so it should be something kudu needs. It is strange that the whole client jar is already in lib yet this is still missing.
| |
superainbower
|
|
superainbo...@163.com
|
签名由网易邮箱大师定制


On Dec 30, 2020, 18:59, superainbower wrote:
hi,
1. Grepping kudu-client.jar the way you suggested finds nothing. Locally in IDEA I only added
this one dependency and it runs fine, so I put this kudu-client dependency directly into lib. Do I still need other dependencies?
2. You're right, close should null-check.

On Dec 30, 2020, 18:33, hailongwang wrote:
This should be something bundled into `kudu-client.jar`.
You can check with: jar -tf kudu-client.jar | grep 'com.stumbleupon.async.Callback'
PS: your close method throws an NPE; the client probably has not been built yet, so add a null check.


Best,
Hailong
On 2020-12-30 17:24:48, "superainbower" wrote:
Hi all,
My use case is reading Kafka data with Flink SQL and writing it to Kudu. Since there is no official Kudu connector, I wrote a custom Kudu sink
connector. It runs fine locally in IDEA,
but after packaging the code and putting the kudu-client.jar dependency into Flink's lib directory, submitting the job to the cluster fails with:


java.lang.NoClassDefFoundError: com/stumbleupon/async/Callback
at 
org.apache.kudu.client.AsyncKuduClient$AsyncKuduClientBuilder.build(AsyncKuduClient.java:2525)
 ~[kudu-client-1.9.0.jar:1.9.0]
at 
org.apache.kudu.client.KuduClient$KuduClientBuilder.build(KuduClient.java:533) 
~[kudu-client-1.9.0.jar:1.9.0]
at com.java.flink.sql.kudu.KuduSinkFunction.open(KuduSinkFunction.java:38) 
~[FlinkProject.jar:?]
at 
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.table.runtime.operators.sink.SinkOperator.open(SinkOperator.java:63)
 ~[flink-table-blink_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:401)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:507)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
Suppressed: java.lang.NullPointerException
at com.java.flink.sql.kudu.KuduSinkFunction.close(KuduSinkFunction.java:86) 
~[FlinkProject.jar:?]
at 
org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:740)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:720)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:643)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:552) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
Caused by: java.lang.ClassNotFoundException: com.stumbleupon.async.Callback
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_162]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_162]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338) ~[?:1.8.0_162]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_162]
... 14 more


I'm not sure where the dependency for the com.stumbleupon.async.Callback class comes from, or how to fix this.
| |
superainbower
|
|
superainbo...@163.com
|
签名由网易邮箱大师定制









Reply: Custom connector error

2020-12-30 Posted by superainbower
To add: in the class org.apache.kudu.client.AsyncKuduClient under Cloudera/kudu on git, I can indeed see
import com.stumbleupon.async.Callback,
so it should be something kudu needs. It is strange that the whole client jar is already in lib yet this is still missing.
| |
superainbower
|
|
superainbo...@163.com
|
签名由网易邮箱大师定制


On Dec 30, 2020, 18:59, superainbower wrote:
hi,
1. Grepping kudu-client.jar the way you suggested finds nothing. Locally in IDEA I only added
this one dependency and it runs fine, so I put this kudu-client dependency directly into lib. Do I still need other dependencies?
2. You're right, close should null-check.

On Dec 30, 2020, 18:33, hailongwang wrote:
This should be something bundled into `kudu-client.jar`.
You can check with: jar -tf kudu-client.jar | grep 'com.stumbleupon.async.Callback'
PS: your close method throws an NPE; the client probably has not been built yet, so add a null check.


Best,
Hailong
On 2020-12-30 17:24:48, "superainbower" wrote:
>Hi all,
>My use case is reading Kafka data with Flink SQL and writing it to Kudu. Since there is no official Kudu connector, I wrote a custom Kudu sink
>connector. It runs fine locally in IDEA,
>but after packaging the code and putting the kudu-client.jar dependency into Flink's lib directory, submitting the job to the cluster fails with:
>
>
>java.lang.NoClassDefFoundError: com/stumbleupon/async/Callback
>at 
> org.apache.kudu.client.AsyncKuduClient$AsyncKuduClientBuilder.build(AsyncKuduClient.java:2525)
>  ~[kudu-client-1.9.0.jar:1.9.0]
>at 
> org.apache.kudu.client.KuduClient$KuduClientBuilder.build(KuduClient.java:533)
>  ~[kudu-client-1.9.0.jar:1.9.0]
>at com.java.flink.sql.kudu.KuduSinkFunction.open(KuduSinkFunction.java:38) 
> ~[FlinkProject.jar:?]
>at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.table.runtime.operators.sink.SinkOperator.open(SinkOperator.java:63)
>  ~[flink-table-blink_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:401)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:507)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
>Suppressed: java.lang.NullPointerException
>at 
> com.java.flink.sql.kudu.KuduSinkFunction.close(KuduSinkFunction.java:86) 
> ~[FlinkProject.jar:?]
>at 
> org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:740)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:720)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:643)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:552)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
>Caused by: java.lang.ClassNotFoundException: com.stumbleupon.async.Callback
>at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
> ~[?:1.8.0_162]
>at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_162]
>at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338) 
> ~[?:1.8.0_162]
>at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_162]
>... 14 more
>
>
>I'm not sure where the dependency for the com.stumbleupon.async.Callback class comes from, or how to fix this.
>| |
>superainbower
>|
>|
>superainbo...@163.com
>|
>签名由网易邮箱大师定制
>








Can Flink CEP patterns be defined automatically or learned from data, or do they have to be added one by one?

2020-12-30 Posted by Xiali Wang
Dear All,

I have recently been learning how to use Flink
CEP and a question came up: are rule engines still widely used? Are everyone's rules defined one by one by hand? Is there anything like Drools that can generate rule templates automatically?








-
Thanks!
Xiali Wang
--
Sent from: xialiwa...@gmail.com


Reply: Custom connector error

2020-12-30 Posted by superainbower
hi,
1. Grepping kudu-client.jar the way you suggested finds nothing. Locally in IDEA I only added
this one dependency and it runs fine, so I put this kudu-client dependency directly into lib. Do I still need other dependencies?
2. You're right, close should null-check.

On Dec 30, 2020, 18:33, hailongwang wrote:
This should be something bundled into `kudu-client.jar`.
You can check with: jar -tf kudu-client.jar | grep 'com.stumbleupon.async.Callback'
PS: your close method throws an NPE; the client probably has not been built yet, so add a null check.


Best,
Hailong
On 2020-12-30 17:24:48, "superainbower" wrote:
>Hi all,
>My use case is reading Kafka data with Flink SQL and writing it to Kudu. Since there is no official Kudu connector, I wrote a custom Kudu sink
>connector. It runs fine locally in IDEA,
>but after packaging the code and putting the kudu-client.jar dependency into Flink's lib directory, submitting the job to the cluster fails with:
>
>
>java.lang.NoClassDefFoundError: com/stumbleupon/async/Callback
>at 
> org.apache.kudu.client.AsyncKuduClient$AsyncKuduClientBuilder.build(AsyncKuduClient.java:2525)
>  ~[kudu-client-1.9.0.jar:1.9.0]
>at 
> org.apache.kudu.client.KuduClient$KuduClientBuilder.build(KuduClient.java:533)
>  ~[kudu-client-1.9.0.jar:1.9.0]
>at com.java.flink.sql.kudu.KuduSinkFunction.open(KuduSinkFunction.java:38) 
> ~[FlinkProject.jar:?]
>at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.table.runtime.operators.sink.SinkOperator.open(SinkOperator.java:63)
>  ~[flink-table-blink_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:401)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:507)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
>Suppressed: java.lang.NullPointerException
>at 
> com.java.flink.sql.kudu.KuduSinkFunction.close(KuduSinkFunction.java:86) 
> ~[FlinkProject.jar:?]
>at 
> org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:740)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:720)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:643)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:552)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
>Caused by: java.lang.ClassNotFoundException: com.stumbleupon.async.Callback
>at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
> ~[?:1.8.0_162]
>at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_162]
>at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338) 
> ~[?:1.8.0_162]
>at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_162]
>... 14 more
>
>
>I'm not sure where the dependency for the com.stumbleupon.async.Callback class comes from, or how to fix this.
>| |
>superainbower
>|
>|
>superainbo...@163.com
>|
>签名由网易邮箱大师定制
>








Retract stream and window aggregation

2020-12-30 Posted by 孙啸龙
Hi everyone:

Version: 1.12.0
API: Flink SQL
Question: after a two-stream join the result is a retract stream, which cannot be used in a window aggregation. How is this scenario handled?

canal-json format improvement question

2020-12-30 Posted by air23
In the official documentation I see that the canal-json format has the following two keys:

canal-json.database.include
canal-json.table.include
Looking at the source code, these two keys are matched with equals, not as an "include/contains" relation. Could these two options support a regular expression to include the configured tables, e.g.
"'canal-json.table.include' = 'test*',"+

This would be useful for sharded databases/tables in Flink scenarios.
Thanks
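
For context, a hedged Java/SQL sketch of how these options are used today; the table, topic, and field names are placeholders, and in 1.12 the values are compared exactly, so only a single literal table name can be given:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CanalJsonIncludeSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // 'canal-json.database.include' / 'canal-json.table.include' are matched exactly
        // in 1.12, so a sharded table set like test_1, test_2, ... cannot be captured
        // with a single pattern such as 'test*'.
        tEnv.executeSql(
                "CREATE TABLE orders_cdc (\n"
              + "  id BIGINT,\n"
              + "  amount DECIMAL(10, 2)\n"
              + ") WITH (\n"
              + "  'connector' = 'kafka',\n"
              + "  'topic' = 'canal_topic',\n"
              + "  'properties.bootstrap.servers' = 'localhost:9092',\n"
              + "  'format' = 'canal-json',\n"
              + "  'canal-json.database.include' = 'test_db',\n"
              + "  'canal-json.table.include' = 'test_orders'\n"
              + ")");
    }
}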





Re: Custom connector error

2020-12-30 Posted by hailongwang
This should be something bundled into `kudu-client.jar`.
You can check with: jar -tf kudu-client.jar | grep 'com.stumbleupon.async.Callback'
PS: your close method throws an NPE; the client probably has not been built yet, so add a null check (a small sketch follows below).
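
A hedged Java sketch of that null check; the real KuduSinkFunction's fields, master address, and row handling are unknown here and only stubbed for illustration:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.kudu.client.KuduClient;

// Sketch of a guarded close(); everything besides the null check is hypothetical.
public class KuduSinkFunctionSketch extends RichSinkFunction<String> {

    private transient KuduClient client;

    @Override
    public void open(Configuration parameters) {
        // Hypothetical master address; if this builder call throws, 'client' stays null.
        client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
    }

    @Override
    public void invoke(String value, Context context) {
        // ... write 'value' to Kudu via 'client' ...
    }

    @Override
    public void close() throws Exception {
        // Guard against the NPE seen in the suppressed exception: close() runs even
        // when open() failed before the client was built.
        if (client != null) {
            client.close();
        }
    }
}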


Best,
Hailong
On 2020-12-30 17:24:48, "superainbower" wrote:
>Hi all,
>My use case is reading Kafka data with Flink SQL and writing it to Kudu. Since there is no official Kudu connector, I wrote a custom Kudu sink
>connector. It runs fine locally in IDEA,
>but after packaging the code and putting the kudu-client.jar dependency into Flink's lib directory, submitting the job to the cluster fails with:
>
>
>java.lang.NoClassDefFoundError: com/stumbleupon/async/Callback
>   at 
> org.apache.kudu.client.AsyncKuduClient$AsyncKuduClientBuilder.build(AsyncKuduClient.java:2525)
>  ~[kudu-client-1.9.0.jar:1.9.0]
>   at 
> org.apache.kudu.client.KuduClient$KuduClientBuilder.build(KuduClient.java:533)
>  ~[kudu-client-1.9.0.jar:1.9.0]
>   at 
> com.java.flink.sql.kudu.KuduSinkFunction.open(KuduSinkFunction.java:38) 
> ~[FlinkProject.jar:?]
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.table.runtime.operators.sink.SinkOperator.open(SinkOperator.java:63)
>  ~[flink-table-blink_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:401)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:507)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
>   Suppressed: java.lang.NullPointerException
>   at 
> com.java.flink.sql.kudu.KuduSinkFunction.close(KuduSinkFunction.java:86) 
> ~[FlinkProject.jar:?]
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:740)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:720)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:643)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:552)
>  ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at 
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
> ~[flink-dist_2.11-1.12.0.jar:1.12.0]
>   at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
>Caused by: java.lang.ClassNotFoundException: com.stumbleupon.async.Callback
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
> ~[?:1.8.0_162]
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_162]
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338) 
> ~[?:1.8.0_162]
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_162]
>   ... 14 more
>
>
>I'm not sure where the dependency for the com.stumbleupon.async.Callback class comes from, or how to fix this.
>| |
>superainbower
>|
>|
>superainbo...@163.com
>|
>签名由网易邮箱大师定制
>






 

Re: flink 1.12 SQL Client error

2020-12-30 Posted by hailongwang
Did you put the jar in only after starting it? Try restarting the SQL Client.




On 2020-12-30 15:26:59, "jiangjiguang719" wrote:
>When running a Hive query with the SQL Client I get an error:
>Even though flink-connector-hive_2.11-1.12.0.jar is there, it still reports java.lang.ClassNotFoundException:
>org.apache.flink.connectors.hive.HiveSource
>Please take a look.
>
>
>Error message:
>
>Flink SQL> select count(*) from zxw_test_1225_01;
>2020-12-30 16:20:42,518 WARN  org.apache.hadoop.hive.conf.HiveConf 
>[] - HiveConf of name hive.spark.client.submit.timeout.interval 
>does not exist
>2020-12-30 16:20:42,519 WARN  org.apache.hadoop.hive.conf.HiveConf 
>[] - HiveConf of name hive.support.sql11.reserved.keywords does 
>not exist
>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf 
>[] - HiveConf of name hive.spark.client.rpc.server.address.use.ip 
>does not exist
>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf 
>[] - HiveConf of name hive.enforce.bucketing does not exist
>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf 
>[] - HiveConf of name hive.server2.enable.impersonation does not 
>exist
>2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf 
>[] - HiveConf of name hive.run.timeout.seconds does not exist
>2020-12-30 16:20:43,065 WARN  
>org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory  [] - The 
>short-circuit local reads feature cannot be used because libhadoop cannot be 
>loaded.
>2020-12-30 16:20:43,245 INFO  org.apache.hadoop.mapred.FileInputFormat 
>[] - Total input files to process : 24
>[ERROR] Could not execute SQL statement. Reason:
>java.lang.ClassNotFoundException: org.apache.flink.connectors.hive.HiveSource
>
>
>lib contents:
># tree lib
>lib
>├── flink-connector-hive_2.11-1.12.0.jar
>├── flink-csv-1.12.0.jar
>├── flink-dist_2.11-1.12.0.jar
>├── flink-hadoop-compatibility_2.11-1.12.0.jar
>├── flink-json-1.12.0.jar
>├── flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
>├── flink-shaded-zookeeper-3.4.14.jar
>├── flink-table_2.11-1.12.0.jar
>├── flink-table-blink_2.11-1.12.0.jar
>├── hive-exec-2.3.4.jar
>├── log4j-1.2-api-2.12.1.jar
>├── log4j-api-2.12.1.jar
>├── log4j-core-2.12.1.jar
>└── log4j-slf4j-impl-2.12.1.jar


Custom connector error

2020-12-30 Posted by superainbower
Hi all,
My use case is reading Kafka data with Flink SQL and writing it to Kudu. Since there is no official Kudu connector, I wrote a custom Kudu sink
connector. It runs fine locally in IDEA,
but after packaging the code and putting the kudu-client.jar dependency into Flink's lib directory, submitting the job to the cluster fails with:


java.lang.NoClassDefFoundError: com/stumbleupon/async/Callback
at 
org.apache.kudu.client.AsyncKuduClient$AsyncKuduClientBuilder.build(AsyncKuduClient.java:2525)
 ~[kudu-client-1.9.0.jar:1.9.0]
at 
org.apache.kudu.client.KuduClient$KuduClientBuilder.build(KuduClient.java:533) 
~[kudu-client-1.9.0.jar:1.9.0]
at 
com.java.flink.sql.kudu.KuduSinkFunction.open(KuduSinkFunction.java:38) 
~[FlinkProject.jar:?]
at 
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.table.runtime.operators.sink.SinkOperator.open(SinkOperator.java:63)
 ~[flink-table-blink_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:401)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:507)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
Suppressed: java.lang.NullPointerException
at 
com.java.flink.sql.kudu.KuduSinkFunction.close(KuduSinkFunction.java:86) 
~[FlinkProject.jar:?]
at 
org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:740)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:720)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:643)
 ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:552) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) 
~[flink-dist_2.11-1.12.0.jar:1.12.0]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
Caused by: java.lang.ClassNotFoundException: com.stumbleupon.async.Callback
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
~[?:1.8.0_162]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_162]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338) 
~[?:1.8.0_162]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_162]
... 14 more


I'm not sure where the dependency for the com.stumbleupon.async.Callback class comes from, or how to fix this.
| |
superainbower
|
|
superainbo...@163.com
|
签名由网易邮箱大师定制



Flink catalog + Hive questions

2020-12-30 Posted by guaishushu1...@163.com
When using the Flink catalog + Hive for metadata persistence, a few problems remain:
1. The DDL field information is all stored in properties, so columns cannot be added, dropped, or changed; the table has to be recreated.
2. The generated tables have no owner information.
3. HMS permissions have no effect for Flink + Hive; tables can be referenced even without permission.



guaishushu1...@163.com
 
From: 19916726683
Sent: 2020-12-24 13:59
To: user-zh
Subject: Re: Flink catalog + Hive questions
You can refer to this:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html
The code pasted is the setFullFileStatus method of org.apache.hadoop.hive.io.HdfsUtils.
Original Message
Sender: Rui li <lirui.fu...@gmail.com>
Recipient: user-zh <user...@flink.apache.org>
Date: Thursday, Dec 24, 2020 11:33
Subject: Re: Flink catalog + Hive questions


Hello,
The image you pasted is no longer visible. Could you paste the official docs link you referred to? Hive supports at least three different authorization modes; when Flink integrates with Hive, currently only storage
based authorization takes effect.
On Thu, Dec 24, 2020 at 10:51 AM 19916726683 <19916726...@163.com> wrote:
  The Hive website describes ACLs and how permission relations are inherited; the source is in the Hive HDFSUtils class,
  and the core code should be the part above.
  Original Message  Sender: Rui li <lirui.fu...@gmail.com>  Recipient: user-zh <user...@flink.apache.org>  Date: Wednesday, Dec 23, 2020 19:41  Subject: Re: Flink catalog + Hive questions
  Which kind of Hive ACL are you using? Flink currently has no dedicated ACL integration; only HMS-side storage based authorization [1] takes effect.
  [1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization#LanguageManualAuthorization-1StorageBasedAuthorizationintheMetastoreServer
  On Wed, Dec 23, 2020 at 4:34 PM 19916726683 <19916726...@163.com> wrote:
    Spark can be configured to decide whether to use Hive's ACL or its own; I'm not sure whether Flink follows the same model.
    Original Message  Sender: guaishushu1...@163.com  Recipient: user-zh <user...@flink.apache.org>  Date: Wednesday, Dec 23, 2020 15:53  Subject: Flink catalog + Hive questions
    When using the Flink catalog + Hive for metadata persistence, I found that Hive's ACL permissions do not take effect. Does Flink simply bypass Hive's ACLs?
    guaishushu1...@163.com
-- Best regards! Rui Li
-- Best regards! Rui Li


RE: flink 1.12.0 + hive 3.1.2 error: java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument

2020-12-30 Posted by Zeng, Jianqiang Zack






Best Regards!
Have a good day!

Zack Zeng
Associate Manager, Business Analyst
Boston Scientific
China Information Services
jianqiang.z...@bsci.com
(+86)21-61417831
#763 Mengzi Road, Shanghai, China
www.bostonscientific.com
[bsc]


From: Zeng, Jianqiang Zack
Sent: Wednesday, December 30, 2020 4:42 PM
To: user-zh@flink.apache.org
Subject: flink 1.12.0 + hive 3.1.2 error: java.lang.NoSuchMethodError:
com.google.common.base.Preconditions.checkArgument

I installed the official Flink 1.12.0 build and it starts normally: JPS shows the related processes and the web UI is up. I configured the connection to Hive
3.1.2 and put the related JARs into Flink's lib folder, but starting sql-client fails. Searching suggests a guava problem, yet my guava jar is a symlink directly to the guava jar under hive, and hadoop shares the same jar as well. Is something else misconfigured? Screenshots below, thanks!

JPS screenshot

WebUI screenshot

Flink/Lib screenshot

Sql-client screenshot

Sql-client startup error screenshot

Hive starts normally screenshot


Best Regards!
Have a good day!

Zack Zeng
Associate Manager, Business Analyst
Boston Scientific
China Information Services
jianqiang.z...@bsci.com
(+86)21-61417831
#763 Mengzi Road, Shanghai, China
www.bostonscientific.com
[bsc]




flink 1.12.0 + hive 3.1.2 error: java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument

2020-12-30 Posted by Zeng, Jianqiang Zack
I installed the official Flink 1.12.0 build and it starts normally: JPS shows the related processes and the web UI is up. I configured the connection to Hive
3.1.2 and put the related JARs into Flink's lib folder, but starting sql-client fails. Searching suggests a guava problem, yet my guava jar is a symlink directly to the guava jar under hive, and hadoop shares the same jar as well. Is something else misconfigured? Screenshots below, thanks!

JPS screenshot

WebUI screenshot

Flink/Lib screenshot

Sql-client screenshot

Sql-client startup error screenshot

Hive starts normally screenshot


Best Regards!
Have a good day!

Zack Zeng
Associate Manager, Business Analyst
Boston Scientific
China Information Services
jianqiang.z...@bsci.com
(+86)21-61417831
#763 Mengzi Road, Shanghai, China
www.bostonscientific.com
[bsc]




flink 1.12 SQL Client error

2020-12-30 Posted by jiangjiguang719
When running a Hive query with the SQL Client I get an error:
Even though flink-connector-hive_2.11-1.12.0.jar is there, it still reports java.lang.ClassNotFoundException:
org.apache.flink.connectors.hive.HiveSource
Please take a look.


Error message:

Flink SQL> select count(*) from zxw_test_1225_01;
2020-12-30 16:20:42,518 WARN  org.apache.hadoop.hive.conf.HiveConf  
   [] - HiveConf of name hive.spark.client.submit.timeout.interval does 
not exist
2020-12-30 16:20:42,519 WARN  org.apache.hadoop.hive.conf.HiveConf  
   [] - HiveConf of name hive.support.sql11.reserved.keywords does not 
exist
2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf  
   [] - HiveConf of name hive.spark.client.rpc.server.address.use.ip 
does not exist
2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf  
   [] - HiveConf of name hive.enforce.bucketing does not exist
2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf  
   [] - HiveConf of name hive.server2.enable.impersonation does not 
exist
2020-12-30 16:20:42,520 WARN  org.apache.hadoop.hive.conf.HiveConf  
   [] - HiveConf of name hive.run.timeout.seconds does not exist
2020-12-30 16:20:43,065 WARN  
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory  [] - The 
short-circuit local reads feature cannot be used because libhadoop cannot be 
loaded.
2020-12-30 16:20:43,245 INFO  org.apache.hadoop.mapred.FileInputFormat  
   [] - Total input files to process : 24
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.flink.connectors.hive.HiveSource


lib contents:
# tree lib
lib
├── flink-connector-hive_2.11-1.12.0.jar
├── flink-csv-1.12.0.jar
├── flink-dist_2.11-1.12.0.jar
├── flink-hadoop-compatibility_2.11-1.12.0.jar
├── flink-json-1.12.0.jar
├── flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
├── flink-shaded-zookeeper-3.4.14.jar
├── flink-table_2.11-1.12.0.jar
├── flink-table-blink_2.11-1.12.0.jar
├── hive-exec-2.3.4.jar
├── log4j-1.2-api-2.12.1.jar
├── log4j-api-2.12.1.jar
├── log4j-core-2.12.1.jar
└── log4j-slf4j-impl-2.12.1.jar

Re: Does Flink SQL support setting a window trigger to implement a continuous trigger?

2020-12-30 Posted by fan_future
These two parameters:
table.exec.emit.early-fire.enabled
table.exec.emit.early-fire.delay
how should they be set??

EnvironmentSettings build =
EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
TableEnvironment tEnv = TableEnvironment.create(build);

Configuration tableConfig = tEnv.getConfig().getConfiguration();
tableConfig.setString("table.exec.emit.early-fire.enabled","true");
tableConfig.setString("table.exec.emit.early-fire.delay","6");
When I do it this way it throws an error



--
Sent from: http://apache-flink.147419.n8.nabble.com/