Re:Re: Re:Re:Re:Re:Flink SQL No Watermark

2020-08-13, post by Zhou Zach



Hi, I tried both: setting the parallelism to 2 and to the Kafka partition count of 9. In both cases only one consumer had a watermark, probably because I only started one producer.
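
For reference, the setting this thread eventually converges on is the source idle-timeout, so that empty partitions stop holding back the watermark. A minimal sketch of setting it on the table config (assuming Flink 1.11 with the Blink planner; variable names are illustrative):

import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.EnvironmentSettings
import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment

val env = StreamExecutionEnvironment.getExecutionEnvironment
val settings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val tableEnv = StreamTableEnvironment.create(env, settings)

// Mark a source subtask as idle after 10s without records, so empty Kafka
// partitions no longer hold back the min-watermark of downstream operators.
tableEnv.getConfig.getConfiguration.setString("table.exec.source.idle-timeout", "10s")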














On 2020-08-13 16:57:25, "Shengkai Fang" wrote:
>hi, the watermark is generated by the watermark assigner by design, so that part is expected.
>May I ask whether you have tried increasing the parallelism to work around this? The data in different partitions may differ in event time.
>
>Zhou Zach wrote on Thu, Aug 13, 2020 at 4:33 PM:
>
>>
>>
>>
>> Hi forideal, Shengkai Fang,
>>
>> After adding env.disableOperatorChaining(), I can see 5 operators:
>>
>>
>>
>>
>> Source: TableSourceScan(table=[[default_catalog, default_database, user]],
>> fields=[uid, sex, age, created_time]) ->
>>
>> Calc(select=[uid, sex, age, created_time, PROCTIME() AS procTime,
>> TO_TIMESTAMP(FROM_UNIXTIME((created_time / 1000), _UTF-16LE'yyyy-MM-dd
>> HH:mm:ss')) AS eventTime]) ->
>>
>> WatermarkAssigner(rowtime=[eventTime], watermark=[(eventTime -
>> 3000:INTERVAL SECOND)]) ->
>>
>> Calc(select=[uid, sex, age, created_time]) ->
>>
>> Sink: Sink(table=[default_catalog.default_database.user_mysql],
>> fields=[uid, sex, age, created_time])
>> However, only the last two operators show a watermark. So with OperatorChaining enabled, since the first three operators have no watermark, none of the chained operators show one. Does that mean the Flink
>> UI can no longer be used to monitor watermarks and we have to rely on an external monitoring tool? In production, OperatorChaining will certainly be enabled.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-08-13 15:39:44, "forideal" wrote:
>> >Hi Zhou Zach:
>> >You can try env.disableOperatorChaining();
>> >and then inspect the watermark of each op. That is a simple way to see what is actually going on.
>> >> how you set the parameter
>> >I use the Flink SQL Blink planner and set it the same way you do:
>> >tableEnv.getConfig().getConfiguration() .setString(key,
>> configs.getString(key, null));
>> >I also defined WATERMARK FOR event_time AS event_time - INTERVAL
>> '10' SECOND in the source table.
>> >
>> >Best forideal
>> >
>> >
>> >On 2020-08-13 15:20:13, "Zhou Zach" wrote:
>> >>
>> >>
>> >>
>> >>Hi forideal,
>> >>I also ran into the No Watermark problem, and I also set the table.exec.source.idle-timeout parameter, as follows:
>> >>
>> >>
>> >>val streamExecutionEnv =
>> StreamExecutionEnvironment.getExecutionEnvironment
>> >>
>> streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
>> >>streamExecutionEnv.setStateBackend(new
>> RocksDBStateBackend("hdfs://nameservice1/flink/checkpoints"))
>> >>
>> >>val blinkEnvSettings =
>> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
>> >>val streamTableEnv =
>> StreamTableEnvironment.create(streamExecutionEnv, blinkEnvSettings)
>> >>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE,CheckpointingMode.EXACTLY_ONCE)
>> >>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_INTERVAL,Duration.ofSeconds(20))
>> >>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_TIMEOUT,Duration.ofSeconds(900))
>> >>
>> >>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionConfigOptions.TABLE_EXEC_SOURCE_IDLE_TIMEOUT,"5s")
>> >>
>> >>
>> >>Also, the job parallelism is set to 1 (so there should be no case of the Flink consumer not consuming Kafka data, given that Kafka keeps producing, right?)
>> >>The Flink UI still shows "Watermark: No data". May I ask how you set the parameter?
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>On 2020-08-13 14:02:58, "forideal" wrote:
>> >>>Hi all,
>> >>>
>> >>>   The root cause has been located.
>> >>>   Since the code generated by codegen cannot be debugged, I made no progress even with the production data and a debug environment set up.
>> >>>   So I disabled chaining and watched where the watermark
>> was generated, to see at which step it stopped being propagated downstream. (With several ops chained together, it is impossible to tell which step has the problem.)
>> >>>   I found that in the WatermarkAssigner(rowtime=[event_time],
>> watermark=[(event_ti...) op, some tasks showed No watermark. Because this op is chained with the source,
>> the watermark of that vertex cannot be displayed and shows "no data". And because there is a group by, the downstream watermark
>> is min(parent task output watermark), so downstream is No watermark as well, which made the problem hard to investigate.
>> >>>   The cause is that some Kafka partitions have no data, which leads to No watermark; adding the
>> table.exec.source.idle-timeout = 10s parameter fixes it.
>> >>>   Of course, if the codegen-generated code could be debugged directly, the analysis would be much simpler: I would have seen right away that most tasks
>> generate watermarks while a few do not, which would have cut the debugging time a lot. For now, disabling chaining
>> and inspecting each op is a great convenience for debugging Flink SQL; I wonder whether the community has a parameter to help developers with this.
>> >>>
>> 

Re:Re:Re:Re:Re:Flink SQL No Watermark

2020-08-13, post by Zhou Zach



Hi forideal, Shengkai Fang,
   
After adding env.disableOperatorChaining(), I can see 5 operators:




Source: TableSourceScan(table=[[default_catalog, default_database, user]], 
fields=[uid, sex, age, created_time]) -> 

Calc(select=[uid, sex, age, created_time, PROCTIME() AS procTime, 
TO_TIMESTAMP(FROM_UNIXTIME((created_time / 1000), _UTF-16LE'yyyy-MM-dd 
HH:mm:ss')) AS eventTime]) -> 

WatermarkAssigner(rowtime=[eventTime], watermark=[(eventTime - 3000:INTERVAL 
SECOND)]) -> 

Calc(select=[uid, sex, age, created_time]) -> 

Sink: Sink(table=[default_catalog.default_database.user_mysql], fields=[uid, 
sex, age, created_time])
However, only the last two operators show a watermark. So with OperatorChaining enabled, since the first three operators have no watermark, none of the chained operators show one. Does that mean the Flink
 UI can no longer be used to monitor watermarks and we have to rely on an external monitoring tool? In production, OperatorChaining will certainly be enabled.
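
If chaining has to stay enabled in production, one option is to tie env.disableOperatorChaining() to a job argument and only break the chains for debugging runs, so per-operator watermarks show up in the UI. A sketch; the --debugWatermarks flag is made up for illustration:

import org.apache.flink.api.java.utils.ParameterTool
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

val params = ParameterTool.fromArgs(args)
val env = StreamExecutionEnvironment.getExecutionEnvironment
// Only split the chains when explicitly requested, e.g. --debugWatermarks true.
if (params.getBoolean("debugWatermarks", false)) {
  env.disableOperatorChaining()
}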














On 2020-08-13 15:39:44, "forideal" wrote:
>Hi Zhou Zach:
>You can try env.disableOperatorChaining();
>and then inspect the watermark of each op. That is a simple way to see what is actually going on.
>> how you set the parameter
>I use the Flink SQL Blink planner and set it the same way you do:
>tableEnv.getConfig().getConfiguration() .setString(key, configs.getString(key, 
>null));
>I also defined WATERMARK FOR event_time AS event_time - INTERVAL '10' 
>SECOND in the source table.
>
>Best forideal
>
>
>On 2020-08-13 15:20:13, "Zhou Zach" wrote:
>>
>>
>>
>>Hi forideal,
>>I also ran into the No Watermark problem, and I also set the table.exec.source.idle-timeout parameter, as follows:
>>
>>
>>val streamExecutionEnv = 
>> StreamExecutionEnvironment.getExecutionEnvironment
>>
>> streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
>>streamExecutionEnv.setStateBackend(new 
>> RocksDBStateBackend("hdfs://nameservice1/flink/checkpoints"))
>>
>>val blinkEnvSettings = 
>> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
>>val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
>> blinkEnvSettings)
>>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE,CheckpointingMode.EXACTLY_ONCE)
>>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_INTERVAL,Duration.ofSeconds(20))
>>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_TIMEOUT,Duration.ofSeconds(900))
>>
>>
>> streamTableEnv.getConfig.getConfiguration.set(ExecutionConfigOptions.TABLE_EXEC_SOURCE_IDLE_TIMEOUT,"5s")
>>
>>
>>Also, the job parallelism is set to 1 (so there should be no case of the Flink consumer not consuming Kafka data, given that Kafka keeps producing, right?)
>>The Flink UI still shows "Watermark: No data". May I ask how you set the parameter?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>On 2020-08-13 14:02:58, "forideal" wrote:
>>>Hi all,
>>>
>>>   The root cause has been located.
>>>   Since the code generated by codegen cannot be debugged, I made no progress even with the production data and a debug environment set up.
>>>   So I disabled chaining and watched where the watermark was generated, to see at which step it stopped
>>> being propagated downstream. (With several ops chained together, it is impossible to tell which step has the problem.)
>>>   I found that in the WatermarkAssigner(rowtime=[event_time],
>>> watermark=[(event_ti...) op, some tasks showed No watermark. Because this op is chained with the source,
>>> the watermark of that vertex cannot be displayed and shows "no data". And because there is a group by, the downstream watermark
>>> is min(parent task output watermark), so downstream is No watermark as well, which made the problem hard to investigate.
>>>   The cause is that some Kafka partitions have no data, which leads to No watermark; adding the
>>> table.exec.source.idle-timeout = 10s parameter fixes it.
>>>   Of course, if the codegen-generated code could be debugged directly, the analysis would be much simpler: I would have seen right away that most tasks generate
>>> watermarks while a few do not, which would have cut the debugging time a lot. For now, disabling chaining and inspecting each op
>>> is a great convenience for debugging Flink SQL; I wonder whether the community has a parameter to help developers with this.
>>>
>>>
>>>Best forideal
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>On 2020-08-13 12:56:57, "forideal" wrote:
>>>>Hi all,
>>>>
>>>>
>>>>I did some debugging on this problem and found that the physical RelNode corresponding to the watermark is 
>>>> StreamExecWatermarkAssigner.
>>>>In translateToPlanInternal it generates a class like the following:
>>>>public final class WatermarkGenerator$2 extends 
>>>>org.apache.flink.table.runtime.generated.WatermarkGenerator { public 
>>>>WatermarkGenerator$2(Object[] references) throws Exception { } @Override 
>>>>public void open(org.apache.flink.configuration.Configuration parameters) 
>>>>throws Exception { } @Override public Long 
>>>>currentWatermark(org.apache.flink.table.dataformat.BaseRow row) throws 
>>>>Exception { org.apache.flink.table.dataformat.SqlTimestamp field$3; boolean 
>>>>isNull$3; boolean isNull$4; org.apache.flink.table.dataformat.SqlTim

Re:Re:Re:Flink SQL No Watermark

2020-08-13, post by Zhou Zach



Hi forideal,
I also ran into the No Watermark problem, and I also set the table.exec.source.idle-timeout parameter, as follows:


val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
streamExecutionEnv.setStateBackend(new 
RocksDBStateBackend("hdfs://nameservice1/flink/checkpoints"))

val blinkEnvSettings = 
EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
blinkEnvSettings)

streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE,CheckpointingMode.EXACTLY_ONCE)

streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_INTERVAL,Duration.ofSeconds(20))

streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_TIMEOUT,Duration.ofSeconds(900))


streamTableEnv.getConfig.getConfiguration.set(ExecutionConfigOptions.TABLE_EXEC_SOURCE_IDLE_TIMEOUT,"5s")


Also, the job parallelism is set to 1 (so there should be no case of the Flink consumer not consuming Kafka data, given that Kafka keeps producing, right?)
The Flink UI still shows "Watermark: No data". May I ask how you set the parameter?














On 2020-08-13 14:02:58, "forideal" wrote:
>Hi all,
>
>   The root cause has been located.
>   Since the code generated by codegen cannot be debugged, I made no progress even with the production data and a debug environment set up.
>   So I disabled chaining and watched where the watermark was generated, to see at which step it stopped being propagated downstream. (With several ops 
> chained together, it is impossible to tell which step has the problem.)
>   I found that in the WatermarkAssigner(rowtime=[event_time], watermark=[(event_ti...) 
> op, some tasks showed No watermark. Because this op is chained with the source, the watermark of that vertex 
> cannot be displayed and shows "no data". And because there is a group by, the downstream watermark is min(parent task 
> output watermark), so downstream is No watermark as well, which made the problem hard to investigate.
>   The cause is that some Kafka partitions have no data, which leads to No watermark; adding the 
> table.exec.source.idle-timeout = 10s parameter fixes it.
>   Of course, if the codegen-generated code could be debugged directly, the analysis would be much simpler: I would have seen right away that most tasks can generate 
> watermarks while a few cannot, which would have cut the debugging time a lot. For now, disabling chaining and inspecting each op 
> is a great convenience for debugging Flink SQL; I wonder whether the community has a parameter to help developers with this.
>
>
>Best forideal
>
>
>
>
>
>
>
>
>On 2020-08-13 12:56:57, "forideal" wrote:
>>Hi all,
>>
>>
>>I did some debugging on this problem and found that the physical RelNode corresponding to the watermark is 
>> StreamExecWatermarkAssigner.
>>In translateToPlanInternal it generates a class like the following:
>>public final class WatermarkGenerator$2
>>    extends org.apache.flink.table.runtime.generated.WatermarkGenerator {
>>
>>  public WatermarkGenerator$2(Object[] references) throws Exception { }
>>
>>  @Override
>>  public void open(org.apache.flink.configuration.Configuration parameters) throws Exception { }
>>
>>  @Override
>>  public Long currentWatermark(org.apache.flink.table.dataformat.BaseRow row) throws Exception {
>>    org.apache.flink.table.dataformat.SqlTimestamp field$3;
>>    boolean isNull$3;
>>    boolean isNull$4;
>>    org.apache.flink.table.dataformat.SqlTimestamp result$5;
>>    isNull$3 = row.isNullAt(12);
>>    field$3 = null;
>>    if (!isNull$3) { field$3 = row.getTimestamp(12, 3); }
>>    isNull$4 = isNull$3 || false;
>>    result$5 = null;
>>    if (!isNull$4) {
>>      result$5 = org.apache.flink.table.dataformat.SqlTimestamp.fromEpochMillis(
>>          field$3.getMillisecond() - ((long) 1L), field$3.getNanoOfMillisecond());
>>    }
>>    if (isNull$4) { return null; } else { return result$5.getMillisecond(); }
>>  }
>>
>>  @Override
>>  public void close() throws Exception { }
>>}
>> 
>>
>>
>>   The key part is result$5 = 
>> org.apache.flink.table.dataformat.SqlTimestamp.fromEpochMillis(field$3.getMillisecond()
>>  - ((long) 1L), field$3.getNanoOfMillisecond());
>>so the watermark is indeed extracted according to the definition WATERMARK FOR event_time AS event_time - INTERVAL '10' SECOND 
>>in the DDL.
>>The Flink graph also contains the corresponding op doing this work, so I do not understand why the result is still no watermark. 
>>And this codegen'd code really cannot be debugged any further.
>>If anyone has a good way to debug codegen-generated code, please let me know. Many thanks.
>>
>>  Best forideal
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>On 2020-08-11 17:13:01, "forideal" wrote:
>>>Hi all, I have a question.
>>>
>>>
>>>   I have a SQL job that runs a session window. When it consumes a topic with a small amount of data, it can generate 
>>> watermarks. When it consumes a large amount of data, no watermark is generated.
>>>   It stays at No Watermark, and for now I have no idea how to troubleshoot it.
>>>  The Flink version is 1.10. The Kafka messages do carry a time field, and other jobs can use it to generate watermarks. 
>>> EventTime mode and the Blink planner are set.
>>>|
>>>No Watermark |
>>>   The SQL is as follows:
>>>
>>>
>>>  DDL:
>>>   create table test(
>>>   user_id varchar,
>>>   action varchar,
>>>   event_time TIMESTAMP(3),
>>>   WATERMARK FOR event_time AS event_time - INTERVAL 
>>> '10' SECOND
>>>   ) with();
>>>
>>>
>>>  DML:
>>>insert into
>>>  console
>>>select
>>>  user_id,
>>>  f_get_str(bind_id) as id_list
>>>from
>>>  (
>>>select
>>>  action as bind_id,
>>>  user_id,
>>>  event_time
>>>from
>>>  (
>>>SELECT
>>>  user_id,
>>>  action,
>>>  PROCTIME() as proc_time,
>>>  event_time
>>>FROM
>>>  test
>>>  ) T
>>>where
>>>  user_id is not null
>>>  and user_id <> ''
>>>  and CHARACTER_LENGTH(user_id) = 24
>>>  ) T
>>>group by
>>>  

flink run-application: how to set environment variables for configuration files

2020-08-03, post by Zhou Zach
Hi all,

I set the HBASE_CONF_PATH variable in the following way, but after submitting to YARN, HBASE_CONF_PATH does not take effect:


/opt/flink-1.11.1/bin/flink run-application -t yarn-application \
-DHBASE_CONF_PATH='/etc/hbase/conf' \


How do I set environment variables when submitting a Flink job?
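
A bare -DHBASE_CONF_PATH=... only sets a Flink configuration key, not a process environment variable. For YARN deployments, environment variables are normally passed through the containerized.* options; a sketch (assuming Flink 1.11; the job jar is a placeholder):

/opt/flink-1.11.1/bin/flink run-application -t yarn-application \
-Dcontainerized.master.env.HBASE_CONF_PATH='/etc/hbase/conf' \
-Dcontainerized.taskmanager.env.HBASE_CONF_PATH='/etc/hbase/conf' \
./my-job.jar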

Reply: flink sql 1.11 kafka source: downstream gets no data with the new WITH-clause options

2020-07-23, post by Zhou Zach
Hi,
Thanks for the detailed answer!



| |
Zhou Zach
|
|
Email: wander...@163.com
|

Signature customized by NetEase Mail Master

On Jul 24, 2020, 11:48, Leonard Xu wrote:
Hi

"2020-07-23T19:53:15.509Z" is the RFC-3339 format, which carries a zone designator; the corresponding data type is timestamp 
with local zone, which should be supported in 1.12 [1].
1.10 does accept the RFC-3339 format, but the default time-zone parsing has problems, so it is being corrected step by step in 1.11 and 1.12.

In 1.11, if the JSON data is in RFC-3339 format, you can read the field out as a string and parse it into the timestamp you need with a UDF in a computed column.

Best
Leonard Xu
[1] https://issues.apache.org/jira/browse/FLINK-18296 
<https://issues.apache.org/jira/browse/FLINK-18296>

> On Jul 24, 2020, at 10:39, Zhou Zach wrote:
>
> Hi,
>
>
> I changed it as suggested, but it still fails:
>
>
> Query:
>
>
>   val streamExecutionEnv = 
> StreamExecutionEnvironment.getExecutionEnvironment
>
> streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
>streamExecutionEnv.setStateBackend(new 
> RocksDBStateBackend("hdfs://nameservice1/flink/checkpoints"))
>
>val blinkEnvSettings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
>val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
> blinkEnvSettings)
>
> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE,CheckpointingMode.EXACTLY_ONCE)
>
> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_INTERVAL,Duration.ofSeconds(20))
>
> streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_TIMEOUT,Duration.ofSeconds(900))
>
>
>streamTableEnv.executeSql(
>  """
>|
>|CREATE TABLE kafka_table (
>|uid BIGINT,
>|sex VARCHAR,
>|age INT,
>|created_time TIMESTAMP(3),
>|procTime AS PROCTIME(),
>|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
>|) WITH (
>|'connector' = 'kafka',
>|'topic' = 'user',
>|'properties.bootstrap.servers' = 'cdh1:9092,cdh2:9092,cdh3:9092',
>|'properties.group.id' = 'user_flink',
>|'scan.startup.mode' = 'latest-offset',
>|'format' = 'json',
>|'json.fail-on-missing-field' = 'false',
>|'json.ignore-parse-errors' = 'true',
>|'json.timestamp-format.standard' = 'ISO-8601'
>|)
>|""".stripMargin)
>
>streamTableEnv.executeSql(
>  """
>|
>|CREATE TABLE print_table
>|(
>|uid BIGINT,
>|sex VARCHAR,
>|age INT,
>|created_time TIMESTAMP(3)
>|)
>|WITH ('connector' = 'print')
>|
>|
>|""".stripMargin)
>
>streamTableEnv.executeSql(
>  """
>|insert into print_table
>|SELECT
>|   uid,sex,age,created_time
>|FROM  kafka_table
>|
>|""".stripMargin)
>
>
> Stack trace:
>
>
> 2020-07-2410:33:32,852INFO  
> org.apache.flink.kafka.shaded.org.apache.kafka.common.utils.AppInfoParser [] 
> - Kafka startTimeMs: 1595558012852
> 2020-07-2410:33:32,853INFO  
> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.KafkaConsumer 
> [] - [Consumer clientId=consumer-user_flink-12, groupId=user_flink] 
> Subscribed to partition(s): user-0
> 2020-07-2410:33:32,853INFO  
> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.KafkaConsumer 
> [] - [Consumer clientId=consumer-user_flink-12, groupId=user_flink] Seeking 
> to offset 36627for partition user-0
> 2020-07-2410:33:32,860INFO  
> org.apache.flink.kafka.shaded.org.apache.kafka.clients.Metadata [] - 
> [Consumer clientId=consumer-user_flink-12, groupId=user_flink] ClusterID: 
> cAT_xBISQNWghT9kR5UuIw
> 2020-07-2410:33:32,871WARN  org.apache.flink.runtime.taskmanager.Task 
>[] - Source: TableSourceScan(table=[[default_catalog, 
> default_database, kafka_table]], fields=[uid, sex, age, created_time]) -> 
> Calc(select=[uid, sex, age, created_time, () AS procTime]) -> 
> WatermarkAssigner(rowtime=[created_time], watermark=[(created_time - 
> 3000:INTERVALSECOND)]) -> Calc(select=[uid, sex, age, created_time]) -> Sink: 
> Sink(table=[default_catalog.default_database.print_table], fields=[uid, sex, 
> age, created_time]) (2/4) (6b585139c083982beb6997e1ae2041ed) switched 
> fromRUNNING to FAILED.
> java.lang.RuntimeException: RowTime field should not be null, please convert 
> it to a non-nulllong value.
>at 
> org.apache.flink.table.runtime.operators.wmassigners.WatermarkAssignerOperator.processElement(WatermarkAssignerOperator.java:115)
>  ~[flink-table-blink

Re:Re: flink sql 1.11 kafka source: downstream gets no data with the new WITH-clause options

2020-07-23, post by Zhou Zach
Hi,


I changed it as suggested, but it still fails:


Query:


   val streamExecutionEnv = 
StreamExecutionEnvironment.getExecutionEnvironment

streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
streamExecutionEnv.setStateBackend(new 
RocksDBStateBackend("hdfs://nameservice1/flink/checkpoints"))

val blinkEnvSettings = 
EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
blinkEnvSettings)

streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE,CheckpointingMode.EXACTLY_ONCE)

streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_INTERVAL,Duration.ofSeconds(20))

streamTableEnv.getConfig.getConfiguration.set(ExecutionCheckpointingOptions.CHECKPOINTING_TIMEOUT,Duration.ofSeconds(900))


streamTableEnv.executeSql(
  """
|
|CREATE TABLE kafka_table (
|uid BIGINT,
|sex VARCHAR,
|age INT,
|created_time TIMESTAMP(3),
|procTime AS PROCTIME(),
|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
|) WITH (
|'connector' = 'kafka',
|'topic' = 'user',
|'properties.bootstrap.servers' = 'cdh1:9092,cdh2:9092,cdh3:9092',
|'properties.group.id' = 'user_flink',
|'scan.startup.mode' = 'latest-offset',
|'format' = 'json',
|'json.fail-on-missing-field' = 'false',
|'json.ignore-parse-errors' = 'true',
|'json.timestamp-format.standard' = 'ISO-8601'
|)
|""".stripMargin)

streamTableEnv.executeSql(
  """
|
|CREATE TABLE print_table
|(
|uid BIGINT,
|sex VARCHAR,
|age INT,
|created_time TIMESTAMP(3)
|)
|WITH ('connector' = 'print')
|
|
|""".stripMargin)

streamTableEnv.executeSql(
  """
|insert into print_table
|SELECT
|   uid,sex,age,created_time
|FROM  kafka_table
|
|""".stripMargin)


Stack trace:


2020-07-2410:33:32,852INFO  
org.apache.flink.kafka.shaded.org.apache.kafka.common.utils.AppInfoParser [] - 
Kafka startTimeMs: 1595558012852
2020-07-2410:33:32,853INFO  
org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.KafkaConsumer 
[] - [Consumer clientId=consumer-user_flink-12, groupId=user_flink] Subscribed 
to partition(s): user-0
2020-07-2410:33:32,853INFO  
org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.KafkaConsumer 
[] - [Consumer clientId=consumer-user_flink-12, groupId=user_flink] Seeking to 
offset 36627for partition user-0
2020-07-2410:33:32,860INFO  
org.apache.flink.kafka.shaded.org.apache.kafka.clients.Metadata [] - [Consumer 
clientId=consumer-user_flink-12, groupId=user_flink] ClusterID: 
cAT_xBISQNWghT9kR5UuIw
2020-07-2410:33:32,871WARN  org.apache.flink.runtime.taskmanager.Task   
 [] - Source: TableSourceScan(table=[[default_catalog, 
default_database, kafka_table]], fields=[uid, sex, age, created_time]) -> 
Calc(select=[uid, sex, age, created_time, () AS procTime]) -> 
WatermarkAssigner(rowtime=[created_time], watermark=[(created_time - 
3000:INTERVALSECOND)]) -> Calc(select=[uid, sex, age, created_time]) -> Sink: 
Sink(table=[default_catalog.default_database.print_table], fields=[uid, sex, 
age, created_time]) (2/4) (6b585139c083982beb6997e1ae2041ed) switched 
fromRUNNING to FAILED.
java.lang.RuntimeException: RowTime field should not be null, please convert it 
to a non-nulllong value.
at 
org.apache.flink.table.runtime.operators.wmassigners.WatermarkAssignerOperator.processElement(WatermarkAssignerOperator.java:115)
 ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]

















On 2020-07-23 21:23:28, "Leonard Xu" wrote:
>Hi
>
>This is an incompatible change of the json format in 1.11 [1], made to support parsing more timestamp formats. 
>You can set json.timestamp-format.standard 
><https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/formats/json.html#json-timestamp-format-standard> to 
> "ISO-8601", and then no further change should be needed.
>
>
>Best
>Leonard Xu
>[1] 
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/formats/json.html#json-timestamp-format-standard
> 
><https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/formats/json.html#json-timestamp-format-standard>
>
> On Jul 23, 2020, at 20:54, Zhou Zach wrote:
> 
>> One sink connector of the current job was not receiving data. I found the reason: the root cause is the time field in the Kafka 
>> messages; the old and new WITH-clause options simply behave differently on the same field data. The Kafka message format is:
>> 
>> 
>> {"uid":46,"sex":"female","age":11,"cre

Re:Re: flink sql 1.11 kafka source: downstream gets no data with the new WITH-clause options

2020-07-23, post by Zhou Zach
One sink connector of the current job was not receiving data. I found the reason: the root cause is the time field in the Kafka 
messages; the old and new WITH-clause options simply behave differently on the same field data. The Kafka message format is:


{"uid":46,"sex":"female","age":11,"created_time":"2020-07-23T19:53:15.509Z"}
Strangely, with created_time defined as TIMESTAMP(3) in the kafka_table DDL, the job runs fine using the old WITH options. 
With the new options it runs without any exception in IDEA, but when submitted to YARN it throws:
java.lang.RuntimeException: RowTime field should not be null, please convert it 
to a non-nulllong value.
at 
org.apache.flink.table.runtime.operators.wmassigners.WatermarkAssignerOperator.processElement(WatermarkAssignerOperator.java:115)
 ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]


Testing locally with the following expression, the result is indeed NULL:
TO_TIMESTAMP('2020-07-23T19:53:15.509Z')
If the Kafka producer sets the created_time field to an integer, or to "2020-07-23 
20:36:55.565", the new WITH options work fine. I spent a whole afternoon on this and nearly lost my mind, but at least the problem is found.
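
One workaround suggested elsewhere in this thread is to read the field as a string and parse it in a computed column with a UDF; a rough sketch, with made-up function and column names, and registration details that may vary by version:

import java.sql.Timestamp
import java.time.Instant
import org.apache.flink.table.functions.ScalarFunction

// Parses an RFC-3339 / ISO-8601 string such as "2020-07-23T19:53:15.509Z".
class ParseIsoTs extends ScalarFunction {
  def eval(s: String): Timestamp =
    if (s == null) null else Timestamp.from(Instant.parse(s))
}

// Register it and reference it from a computed column in the DDL, for example:
//   streamTableEnv.registerFunction("parse_iso_ts", new ParseIsoTs)
//   ...
//   created_time_str VARCHAR,
//   created_time AS parse_iso_ts(created_time_str),
//   WATERMARK FOR created_time AS created_time - INTERVAL '3' SECOND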











On 2020-07-23 20:10:43, "Leonard Xu" wrote:
>Hi
>
>When you say the downstream gets no data, do you mean the current job itself does not consume any data?
>
>Normally that should not happen. Could you provide code that reproduces it? 
>
>Best,
>Leonard Xu
>
>
>> On Jul 23, 2020, at 18:13, Zhou Zach wrote:
>> 
>> Hi all,
>> 
>> Following the docs at https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html#start-reading-position,
>> when kafka_table is created with the new options the downstream gets no data, while with the old options it does. Is there a pitfall in the new-option approach?
>> 
>> 
>> Old options:
>>streamTableEnv.executeSql(
>>  """
>>|
>>|CREATE TABLE kafka_table (
>>|uid BIGINT,
>>|sex VARCHAR,
>>|age INT,
>>|created_time TIMESTAMP(3),
>>|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
>>|) WITH (
>>|
>>| 'connector.type' = 'kafka',
>>|'connector.version' = 'universal',
>>|'connector.topic' = 'user',
>>|'connector.startup-mode' = 'latest-offset',
>>|'connector.properties.zookeeper.connect' = 
>> 'cdh1:2181,cdh2:2181,cdh3:2181',
>>|'connector.properties.bootstrap.servers' = 
>> 'cdh1:9092,cdh2:9092,cdh3:9092',
>>|'connector.properties.group.id' = 'user_flink',
>>|'format.type' = 'json',
>>|'format.derive-schema' = 'true'
>>|
>>|)
>>|""".stripMargin)
>> 
>> New options:
>> 
>>streamTableEnv.executeSql(
>>  """
>>|
>>|CREATE TABLE kafka_table (
>>|
>>|uid BIGINT,
>>|sex VARCHAR,
>>|age INT,
>>|created_time TIMESTAMP(3),
>>|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
>>|) WITH (
>>|'connector' = 'kafka',
>>| 'topic' = 'user',
>>|'properties.bootstrap.servers' = 'cdh1:9092,cdh2:9092,cdh3:9092',
>>|'properties.group.id' = 'user_flink',
>>|'scan.startup.mode' = 'latest-offset',
>>|'format' = 'json',
>>|'json.fail-on-missing-field' = 'false',
>>|'json.ignore-parse-errors' = 'true'
>>|)
>>|""".stripMargin)


flink sql 1.11 kafka source: downstream gets no data with the new WITH-clause options

2020-07-23, post by Zhou Zach
Hi all,

Following the docs at https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html#start-reading-position,
when kafka_table is created with the new options the downstream gets no data, while with the old options it does. Is there a pitfall in the new-option approach?


Old options:
streamTableEnv.executeSql(
  """
|
|CREATE TABLE kafka_table (
|uid BIGINT,
|sex VARCHAR,
|age INT,
|created_time TIMESTAMP(3),
|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
|) WITH (
|
| 'connector.type' = 'kafka',
|'connector.version' = 'universal',
|'connector.topic' = 'user',
|'connector.startup-mode' = 'latest-offset',
|'connector.properties.zookeeper.connect' = 
'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.properties.bootstrap.servers' = 
'cdh1:9092,cdh2:9092,cdh3:9092',
|'connector.properties.group.id' = 'user_flink',
|'format.type' = 'json',
|'format.derive-schema' = 'true'
|
|)
|""".stripMargin)

New options:

streamTableEnv.executeSql(
  """
|
|CREATE TABLE kafka_table (
|
|uid BIGINT,
|sex VARCHAR,
|age INT,
|created_time TIMESTAMP(3),
|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
|) WITH (
|'connector' = 'kafka',
| 'topic' = 'user',
|'properties.bootstrap.servers' = 'cdh1:9092,cdh2:9092,cdh3:9092',
|'properties.group.id' = 'user_flink',
|'scan.startup.mode' = 'latest-offset',
|'format' = 'json',
|'json.fail-on-missing-field' = 'false',
|'json.ignore-parse-errors' = 'true'
|)
|""".stripMargin)

Re:Re:Reply: flink1.11 set yarn slots failed

2020-07-16, post by Zhou Zach
Nice, now I do not even need to read the Command-Line Interface docs.

















On 2020-07-16 16:16:00, "xiao cai" wrote:
>You can look here: https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html
>
> Original message 
>From: Zhou Zach
>To: user-zh
>Sent: Thursday, July 16, 2020 15:28
>Subject: Re:Reply: flink1.11 set yarn slots failed
>
>
>The -D prefix works. What option do I use to set the YARN application name? Some of the 1.11 docs on the website no longer work. On 2020-07-16 15:03:14, "flinkcx" 
> wrote: >Shouldn't it be set with a -D prefix, for example -Dtaskmanager.numberOfTaskSlots=4 > > 
>> Original message >From: Zhou Zach >To: Flink user-zh mailing 
>list >Sent: Thursday, July 16, 2020 14:51 >Subject: flink1.11 set 
>yarn slots failed > > >Hi all, I use the following command to set the number of slots per TaskManager: 
>/opt/flink-1.11.0/bin/flink run-application -t yarn-application \ 
>-Djobmanager.memory.process.size=1024m \ 
>-Dtaskmanager.memory.process.size=2048m \ -ys 4 \ 
>but it does not override the defaults in /opt/flink-1.11.0/bin/flink/conf/flink-conf.yaml; changes only take effect by editing flink-conf.yaml. With run-application, how do I set the number
> of slots per TaskManager? Also, is there any way to increase the Available Task 
>Slots value in the Flink UI? It is 0 every time I submit a job.


Re:Reply: flink1.11 set yarn slots failed

2020-07-16, post by Zhou Zach
The -D prefix works. What option do I use to set the YARN application name? Some of the 1.11 docs on the website no longer work.

















On 2020-07-16 15:03:14, "flinkcx" wrote:
>Shouldn't it be set with a -D prefix, for example -Dtaskmanager.numberOfTaskSlots=4?
>
>
> Original message 
>From: Zhou Zach
>To: Flink user-zh mailing list
>Sent: Thursday, July 16, 2020 14:51
>Subject: flink1.11 set yarn slots failed
>
>
>Hi all, I use the following command to set the number of slots per TaskManager: /opt/flink-1.11.0/bin/flink 
>run-application -t yarn-application \ -Djobmanager.memory.process.size=1024m \ 
>-Dtaskmanager.memory.process.size=2048m \ -ys 4 \ 
>but it does not override the defaults in /opt/flink-1.11.0/bin/flink/conf/flink-conf.yaml; changes only take effect by editing flink-conf.yaml. With run-application, how do I set the number
> of slots per TaskManager? Also, is there any way to increase the Available Task 
>Slots value in the Flink UI? It is 0 every time I submit a job.


flink1.11 set yarn slots failed

2020-07-16, post by Zhou Zach
Hi all,


I use the following command to set the number of slots per TaskManager:
/opt/flink-1.11.0/bin/flink run-application -t yarn-application \
-Djobmanager.memory.process.size=1024m \
-Dtaskmanager.memory.process.size=2048m \
 -ys 4 \


I find this does not override the defaults in /opt/flink-1.11.0/bin/flink/conf/flink-conf.yaml; changes only take effect by 
editing flink-conf.yaml. With run-application, how do I set the number of slots per TaskManager?
Also, is there any way to increase the Available Task Slots value in the Flink UI? It is 0 every time I submit a job.
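
With run-application, these settings normally go through -D generic options instead of the old -ys style flags; a sketch (assuming Flink 1.11; the job jar is a placeholder, and yarn.application.name sets the YARN application name):

/opt/flink-1.11.0/bin/flink run-application -t yarn-application \
-Djobmanager.memory.process.size=1024m \
-Dtaskmanager.memory.process.size=2048m \
-Dtaskmanager.numberOfTaskSlots=4 \
-Dyarn.application.name='my-flink-job' \
./my-job.jar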

flink sql 1.11 create hive table error

2020-07-15, post by Zhou Zach
Hi all,
Does CREATE TABLE in Flink SQL 1.11 not support IF NOT EXISTS?


Query:
val hiveConfDir = "/etc/hive/conf" 
val hiveVersion = "2.1.1"

val odsCatalog = "odsCatalog"
val odsHiveCatalog = new HiveCatalog(odsCatalog, "ods", hiveConfDir, 
hiveVersion)
streamTableEnv.registerCatalog(odsCatalog, odsHiveCatalog)

streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
streamTableEnv.executeSql(
  """
|
|CREATE TABLE IF NOT EXISTS odsCatalog.ods.hive_table (
|  user_id STRING,
|  age INT
|) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet 
TBLPROPERTIES (
|  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
|  'sink.partition-commit.trigger'='partition-time',
|  'sink.partition-commit.delay'='0s',
|  'sink.partition-commit.policy.kind'='metastore'
|)
|
|""".stripMargin)
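
If the Hive dialect in this version rejects IF NOT EXISTS, one workaround is to probe the catalog first and only create the table when it is missing; a sketch using the names from the query above (assuming Flink 1.11):

import org.apache.flink.table.catalog.ObjectPath

// Only issue the CREATE TABLE when the table is not already in the metastore.
if (!odsHiveCatalog.tableExists(new ObjectPath("ods", "hive_table"))) {
  streamTableEnv.executeSql(
    """
      |CREATE TABLE odsCatalog.ods.hive_table (
      |  user_id STRING,
      |  age INT
      |) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet TBLPROPERTIES (
      |  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
      |  'sink.partition-commit.trigger'='partition-time',
      |  'sink.partition-commit.delay'='0s',
      |  'sink.partition-commit.policy.kind'='metastore'
      |)
      |""".stripMargin)
}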












java.util.concurrent.CompletionException: 
org.apache.flink.client.deployment.application.ApplicationExecutionException: 
Could not execute application.
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
 ~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
 ~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943) 
~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
 ~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) 
~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
 ~[?:1.8.0_161]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:245)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.lambda$runApplicationAsync$1(ApplicationDispatcherBootstrap.java:199)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_161]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_161]
at 
org.apache.flink.runtime.concurrent.akka.ActorSystemScheduledExecutorAdapter$ScheduledFutureTask.run(ActorSystemScheduledExecutorAdapter.java:154)
 [data-flow-1.0.jar:?]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) 
[data-flow-1.0.jar:?]
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
 [data-flow-1.0.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
[data-flow-1.0.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
[data-flow-1.0.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
[data-flow-1.0.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 
[data-flow-1.0.jar:?]
Caused by: 
org.apache.flink.client.deployment.application.ApplicationExecutionException: 
Could not execute application.
... 11 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main 
method caused an error: SQL parse failed. Encountered "NOT" at line 3, column 
17.
Was expecting one of:
 
"ROW" ...
"COMMENT" ...
"LOCATION" ...
"PARTITIONED" ...
"STORED" ...
"TBLPROPERTIES" ...
"(" ...
"." ...

at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149) 
~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:230)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
... 10 more
Caused by: org.apache.flink.table.api.SqlParserException: SQL parse failed. 
Encountered "NOT" at line 3, column 17.
Was expecting one of:
 
"ROW" ...
"COMMENT" ...
"LOCATION" ...
"PARTITIONED" ...
"STORED" ...
"TBLPROPERTIES" ...
"(" ...
"." ...

at 
org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:56)
 ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:76) 
~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:678)
 

Re:Re: flink1.11 sink hive error

2020-07-14, post by Zhou Zach
Hi,


I just dropped the Hive table that Flink sinks into and deleted its HDFS directory, and cleared the HBase table data (HBase is queried 
through a Hive table in Hue); after restarting the program, it works now.
If the problem comes back I will try your approach. Thanks for the answer!

















On 2020-07-14 20:42:16, "Leonard Xu" wrote:
>Hi,
>After installing the Hive metastore, add a configuration like this to your hivehome/conf/hive-site.xml:
>  
>hive.metastore.uris
>thrift://:9083
>Thrift URI for the remote metastore. Used by metastore client 
> to connect to remote metastore.
>  
>This is generally how production environments are configured as well.
>Then configure Flink to connect to Hive as described in [1]; it should be no different from what you used before, except that an embedded metastore is not supported.
>
>Best,
>Leonard Xu
>[1] 
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/#connecting-to-hive
> 
><https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/#connecting-to-hive>
>
>> On Jul 14, 2020, at 20:29, Zhou Zach wrote:
>> 
>> Hi,
>> 
>> 
>> Do I configure hive.metastore.uris in Flink's conf files?
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On 2020-07-14 20:03:11, "Leonard Xu" wrote:
>>> Hello
>>> 
>>> 
>>>> On Jul 14, 2020, at 19:52, Zhou Zach wrote:
>>>> 
>>>> : Embedded metastore is not allowed.
>>> 
>>> When Flink integrates with Hive, an embedded metastore is not supported. You need to start a Hive metastore and configure 
>>> hive.metastore.uris in the conf file. See [1] for the supported metastore versions.
>>> 
>>> Best,
>>> Leonard Xu
>>> [1] 
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/#using-bundled-hive-jar
>>>  
>>> <https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/#using-bundled-hive-jar>
>


Re:Re: flink1.11 sink hive error

2020-07-14, post by Zhou Zach
Hi,


Do I configure hive.metastore.uris in Flink's conf files?

















On 2020-07-14 20:03:11, "Leonard Xu" wrote:
>Hello
>
>
>> On Jul 14, 2020, at 19:52, Zhou Zach wrote:
>> 
>> : Embedded metastore is not allowed.
>
>When Flink integrates with Hive, an embedded metastore is not supported. You need to start a Hive metastore and configure 
>hive.metastore.uris in the conf file. See [1] for the supported metastore versions.
>
>Best,
>Leonard Xu
>[1] 
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/#using-bundled-hive-jar
> 
><https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/#using-bundled-hive-jar>


flink1.11 sink hive error

2020-07-14, post by Zhou Zach
Hi all,
Flink 1.11 SQL sink to a Hive table fails with:


java.util.concurrent.CompletionException: 
org.apache.flink.client.deployment.application.ApplicationExecutionException: 
Could not execute application.
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
 ~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
 ~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943) 
~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
 ~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) 
~[?:1.8.0_161]
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
 ~[?:1.8.0_161]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:245)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.lambda$runApplicationAsync$1(ApplicationDispatcherBootstrap.java:199)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_161]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_161]
at 
org.apache.flink.runtime.concurrent.akka.ActorSystemScheduledExecutorAdapter$ScheduledFutureTask.run(ActorSystemScheduledExecutorAdapter.java:154)
 [data-flow-1.0.jar:?]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) 
[qile-data-flow-1.0.jar:?]
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
 [data-flow-1.0.jar:?]
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
[qile-data-flow-1.0.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
[data-flow-1.0.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
[data-flow-1.0.jar:?]
at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 
[data-flow-1.0.jar:?]
Caused by: 
org.apache.flink.client.deployment.application.ApplicationExecutionException: 
Could not execute application.
... 11 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main 
method caused an error: Embedded metastore is not allowed. Make sure you have 
set a valid value for hive.metastore.uris
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149) 
~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:230)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
... 10 more
Caused by: java.lang.IllegalArgumentException: Embedded metastore is not 
allowed. Make sure you have set a valid value for hive.metastore.uris
at 
org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:139) 
~[data-flow-1.0.jar:?]
at 
org.apache.flink.table.catalog.hive.HiveCatalog.(HiveCatalog.java:171) 
~[flink-sql-connector-hive-2.2.0_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.table.catalog.hive.HiveCatalog.(HiveCatalog.java:157) 
~[flink-sql-connector-hive-2.2.0_2.11-1.11.0.jar:1.11.0]
at 
cn.ibobei.qile.dataflow.sql.FromKafkaSinkHiveAndHbase$.main(FromKafkaSinkHiveAndHbase.scala:27)
 ~[data-flow-1.0.jar:?]
at 
cn.ibobei.qile.dataflow.sql.FromKafkaSinkHiveAndHbase.main(FromKafkaSinkHiveAndHbase.scala)
 ~[data-flow-1.0.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_161]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_161]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_161]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_161]
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:288)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149) 
~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 

Re:Reply:Re: flink sinks to both hbase and hive, but hbase has fewer records

2020-07-13, post by Zhou Zach
Hi,
Thanks to the community for the helpful answers!

















On 2020-07-14 11:00:18, "夏帅" wrote:
>Hi,
>Under the hood it is still the StreamingFileSink, so currently it can only append.
>
>
>------
>From: Zhou Zach 
>Sent: Tuesday, July 14, 2020 10:56
>To: user-zh 
>Subject: Re:Re: flink sinks to both hbase and hive, but hbase has fewer records
>
>
>
>
>Hi Leonard,
>It turns out there were duplicate keys and HBase upserted them. Does Hive streaming writing currently support only append mode and not upsert?
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 2020-07-14 09:56:00, "Leonard Xu" wrote:
>>Hi,
>>
>>> On Jul 14, 2020, at 09:52, Zhou Zach wrote:
>>> 
>>>>>   |   CONCAT(SUBSTRING(MD5(CAST(uid AS VARCHAR)), 0, 6), 
>>>>> cast(CEILING(UNIX_TIMESTAMP(created_time)/60) as string), sex) as uid,
>>
>>Could you check whether the extracted rowkey has duplicates?
>>
>>Best,
>>Leonard Xu


Re:Re: flink sinks to both hbase and hive, but hbase has fewer records

2020-07-13, post by Zhou Zach



Hi Leonard,
It turns out there were duplicate keys and HBase upserted them. Does Hive streaming writing currently support only append mode and not upsert?














On 2020-07-14 09:56:00, "Leonard Xu" wrote:
>Hi,
>
>> On Jul 14, 2020, at 09:52, Zhou Zach wrote:
>> 
>>>>   |   CONCAT(SUBSTRING(MD5(CAST(uid AS VARCHAR)), 0, 6), 
>>>> cast(CEILING(UNIX_TIMESTAMP(created_time)/60) as string), sex) as uid,
>
>Could you check whether the extracted rowkey has duplicates?
>
>Best,
>Leonard Xu


Re:Re: flink sinks to both hbase and hive, but hbase has fewer records

2020-07-13, post by Zhou Zach






Hi, Leonard
I set 'connector.write.buffer-flush.interval' = '1s' and restarted the program.
At the start of sending messages, say after 4 messages, both Hive and HBase had received 4. When 48 messages had been sent, I stopped the producer,
and the query results were then 19 rows in HBase and 48 in Hive. If Flink checked every second whether the HBase sink 
buffer had reached 1mb and only flushed when it had, that still would not explain why HBase and Hive were in sync at the beginning. Strange.











On 2020-07-13 21:50:54, "Leonard Xu" wrote:
>Hi, Zhou
>
>
>>   'connector.write.buffer-flush.max-size' = '1mb',
>>   'connector.write.buffer-flush.interval' = ‘0s'
>
>(1) The connector.write.buffer-flush.max-size option only accepts the unit mb; other units are rejected, which is why those errors are reported. It is a buffering parameter for the 
>BufferedMutator: a write is triggered once the buffer reaches the given size, while the flush.interval option flushes periodically at the given interval; the two 
>are combined as needed. When connector.write.buffer-flush.interval
> is set to 0s, 
>there is no periodic flush, so writes only happen once connector.write.buffer-flush.max-size is reached. If you set connector.write.buffer-flush.interval
> to 1s you should start seeing data.
>
>(2) Before 1.11.0 the HBase connector only supports 1.4.3, so specifying 2.1.0 fails; starting from 1.11.0 it supports 1.4.x, 
>so the option in the new 1.11.0 connector is 'connector' = 'hbase-1.4'. The HBase 
>1.4.x APIs are compatible, and the community is also discussing support for HBase 2.x [1].
>
>
>Best,
>Leonard Xu
>[1] 
>http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Upgrade-HBase-connector-to-2-2-x-tc42657.html#a42674
> 
><http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Upgrade-HBase-connector-to-2-2-x-tc42657.html#a42674>
>
>
>> On Jul 13, 2020, at 21:09, Zhou Zach wrote:
>> 
>> 
>> 
>> Flink subscribes to Kafka messages and sinks to both HBase and Hive.
>> After sending 42 records to Kafka and then stopping the producer, Hive returns exactly 42 rows, but HBase only returns 30.
>> 
>> 
>> query:
>> streamTableEnv.executeSql(
>>  """
>>|
>>|CREATE TABLE hbase_table (
>>|rowkey VARCHAR,
>>|cf ROW(sex VARCHAR, age INT, created_time VARCHAR)
>>|) WITH (
>>|'connector.type' = 'hbase',
>>|'connector.version' = '2.1.0',
>>|'connector.table-name' = 'ods:user_hbase6',
>>|'connector.zookeeper.quorum' = 'cdh1:2181,cdh2:2181,cdh3:2181',
>>|'connector.zookeeper.znode.parent' = '/hbase',
>>|'connector.write.buffer-flush.max-size' = '1mb',
>>|'connector.write.buffer-flush.max-rows' = '1',
>>|'connector.write.buffer-flush.interval' = '0s'
>>|)
>>|""".stripMargin)
>> 
>>val statementSet = streamTableEnv.createStatementSet()
>>val insertHbase =
>>  """
>>|insert into hbase_table
>>|SELECT
>>|   CONCAT(SUBSTRING(MD5(CAST(uid AS VARCHAR)), 0, 6), 
>> cast(CEILING(UNIX_TIMESTAMP(created_time)/60) as string), sex) as uid,
>>|   ROW(sex, age, created_time ) as cf
>>|FROM  (select uid,sex,age, cast(created_time as VARCHAR) as 
>> created_time from kafka_table)
>>|
>>|""".stripMargin
>> 
>>statementSet.addInsertSql(insertHbase)
>> 
>>val insertHive =
>>  """
>>|
>>|INSERT INTO odsCatalog.ods.hive_table
>>    |SELECT uid, age, DATE_FORMAT(created_time, 'yyyy-MM-dd'), 
>> DATE_FORMAT(created_time, 'HH')
>>|FROM kafka_table
>>|
>>|""".stripMargin
>>statementSet.addInsertSql(insertHive)
>> 
>> 
>>statementSet.execute()
>> 
>> 
>> Is it because of 'connector.write.buffer-flush.max-size' = 
>> '1mb'? I tried setting '0', '10b', and '1kb', and all of them failed with:
>> Property 'connector.write.buffer-flush.max-size' must be a memory size (in 
>> bytes) value but was: 1kb
>> Property 'connector.write.buffer-flush.max-size' must be a memory size (in 
>> bytes) value but was: 10b
>> Property 'connector.write.buffer-flush.max-size' must be a memory size (in 
>> bytes) value but was: 1
>> 
>> 
>> 
>> 
>> 
>> 
>> Also, following the official docs at
>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/hbase.html
>> 
>> 
>> the options there are not recognized either; the error is:
>> Caused by: org.apache.flink.table.api.ValidationException: Could not find 
>> any factory for identifier 'hbase-2.1.0' that implements 
>> 'org.apache.flink.table.factories.DynamicTableSinkFactory' in the classpath.
>> 
>> 
>> Looking at the source code,
>> org.apache.flink.table.descriptors.HBaseValidator
>> public static final String CONNECTOR_TYPE_VALUE_HBASE = "hbase";
>>public static final String CONNECTOR_VERSION_VALUE_143 = "2.1.0";
>>public static final String CONNECTOR_TABLE_NAME = "connector.table-name";
>>public static final String CONNECTOR_ZK_QUORUM = 
>> "connector.zookeeper.quorum";
>>public static final String CONNECTOR_ZK_NODE_PARENT = 
>> "connector.zookeeper.znode.parent";
>>public static final String CONNECTOR_WRITE_BUFFER_FLUSH_MAX_SIZE = 
>> "connector.write.buffer-flush.max-size";
>>public static final String CONNECTOR_WRITE_BUFFER_FLUSH_MAX_ROWS = 
>> "connector.write.buffer-flush.max-rows";
>>public static final String CONNECTOR_WRITE_BUFFER_FLUSH_INTERVAL = 
>> "connector.write.buffer-flush.interval";
>> the option keys are still the old ones.
>


flink sinks to both hbase and hive, but hbase has fewer records

2020-07-13, post by Zhou Zach


Flink subscribes to Kafka messages and sinks to both HBase and Hive.
After sending 42 records to Kafka and then stopping the producer, Hive returns exactly 42 rows, but HBase only returns 30.


query:
streamTableEnv.executeSql(
  """
|
|CREATE TABLE hbase_table (
|rowkey VARCHAR,
|cf ROW(sex VARCHAR, age INT, created_time VARCHAR)
|) WITH (
|'connector.type' = 'hbase',
|'connector.version' = '2.1.0',
|'connector.table-name' = 'ods:user_hbase6',
|'connector.zookeeper.quorum' = 'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.zookeeper.znode.parent' = '/hbase',
|'connector.write.buffer-flush.max-size' = '1mb',
|'connector.write.buffer-flush.max-rows' = '1',
|'connector.write.buffer-flush.interval' = '0s'
|)
|""".stripMargin)

val statementSet = streamTableEnv.createStatementSet()
val insertHbase =
  """
|insert into hbase_table
|SELECT
|   CONCAT(SUBSTRING(MD5(CAST(uid AS VARCHAR)), 0, 6), 
cast(CEILING(UNIX_TIMESTAMP(created_time)/60) as string), sex) as uid,
|   ROW(sex, age, created_time ) as cf
|FROM  (select uid,sex,age, cast(created_time as VARCHAR) as 
created_time from kafka_table)
|
|""".stripMargin

statementSet.addInsertSql(insertHbase)

val insertHive =
  """
|
|INSERT INTO odsCatalog.ods.hive_table
    |SELECT uid, age, DATE_FORMAT(created_time, 'yyyy-MM-dd'), 
DATE_FORMAT(created_time, 'HH')
|FROM kafka_table
|
|""".stripMargin
statementSet.addInsertSql(insertHive)


statementSet.execute()


Is it because of 'connector.write.buffer-flush.max-size' = 
'1mb'? I tried setting '0', '10b', and '1kb', and all of them failed with:
Property 'connector.write.buffer-flush.max-size' must be a memory size (in 
bytes) value but was: 1kb
Property 'connector.write.buffer-flush.max-size' must be a memory size (in 
bytes) value but was: 10b
Property 'connector.write.buffer-flush.max-size' must be a memory size (in 
bytes) value but was: 1






Also, following the official docs at
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/hbase.html


the options there are not recognized either; the error is:
Caused by: org.apache.flink.table.api.ValidationException: Could not find any 
factory for identifier 'hbase-2.1.0' that implements 
'org.apache.flink.table.factories.DynamicTableSinkFactory' in the classpath.


Looking at the source code,
org.apache.flink.table.descriptors.HBaseValidator
public static final String CONNECTOR_TYPE_VALUE_HBASE = "hbase";
public static final String CONNECTOR_VERSION_VALUE_143 = "2.1.0";
public static final String CONNECTOR_TABLE_NAME = "connector.table-name";
public static final String CONNECTOR_ZK_QUORUM = 
"connector.zookeeper.quorum";
public static final String CONNECTOR_ZK_NODE_PARENT = 
"connector.zookeeper.znode.parent";
public static final String CONNECTOR_WRITE_BUFFER_FLUSH_MAX_SIZE = 
"connector.write.buffer-flush.max-size";
public static final String CONNECTOR_WRITE_BUFFER_FLUSH_MAX_ROWS = 
"connector.write.buffer-flush.max-rows";
public static final String CONNECTOR_WRITE_BUFFER_FLUSH_INTERVAL = 
"connector.write.buffer-flush.interval";
the option keys are still the old ones.
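
For reference, the change the replies earlier in this thread converge on is a periodic flush, so buffered mutations are written out even before the size threshold is reached; a sketch, with everything else unchanged from the DDL above:

-- in the WITH clause of hbase_table:
'connector.write.buffer-flush.max-size' = '1mb',
'connector.write.buffer-flush.interval' = '1s'   -- was '0s': flush every second instead of only on max-size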

Re:Re: Re: Re: Reply:Re: Re: Table options do not contain an option key 'connector' for discovering a connector.

2020-07-13, post by Zhou Zach
OK, thanks for the answer.

















On 2020-07-13 19:49:10, "Jingsong Li" wrote:
>kafka_table needs to be created under the default dialect.
>
>Regardless of the dialect, the table is saved to the Hive metastore (unless the temporary table syntax is used).
>
>Best,
>Jingsong
>
>On Mon, Jul 13, 2020 at 7:46 PM Zhou Zach  wrote:
>
>> kafka_table was created under the default dialect; when I switch to the HiveCatalog (Hive dialect), neither WATERMARK nor the WITH syntax is supported.
>> If a table is created under the default dialect, is it only valid in the current session?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-07-13 19:27:44, "Jingsong Li" wrote:
>> >Hi,
>> >
>> >Question 1:
>> >
>> >As long as the current catalog is the HiveCatalog.
>> >In theory the Kafka table is also stored in the Hive metastore; if you do not want the error, you can use CREATE TABLE XXX IF NOT EXISTS.
>> >
>> >To clarify, what does "cannot see it" mean? Could you try the Kafka table alone: is it gone after a restart?
>> >
>> >Question 2:
>> >
>> >A table created with filesystem is a filesystem table; it has nothing to do with the Hive
>> >metastore. You need to use the syntax for creating filesystem tables [1].
>> >
>> >A filesystem table writes its data directly to the file system, in a format compatible with Hive, so if the write path is the path of some Hive table, it can be queried from the Hive side.
>> >However, its partition commit does not support the metastore, so there is no default implementation that automatically adds
>> >partitions to Hive; you need a custom partition-commit-policy.
>> >
>> >[1]
>> >
>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/filesystem.html
>> >
>> >Best,
>> >Jingsong
>> >
>> >On Mon, Jul 13, 2020 at 6:51 PM Zhou Zach  wrote:
>> >
>> >> Embarrassing:
>> >> I had two projects open and changed the wrong one. Now the data can be queried from Hive successfully. Thanks to the community for the warm replies, @Jingsong Li,  @夏帅
>> >> I have watched Jingsong's video from the group several times over the last two days; sincere thanks!
>> >> Two more questions:
>> >> Question 1:
>> >> The created kafka_table is visible neither in Hive nor in the Flink
>> SQL client. Also, every rerun of the program fails unless hive_table is dropped first; after dropping hive_table1 it runs. But the program always runs without dropping kafka_table. So is the created kafka_table a temporary table, and is only hive_table stored in the metastore?
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> Question 2:
>> >> A helpful community member just answered that a Hive table can also be created with the filesystem connector instead of the HiveCatalog. I tried it and got an error:
>> >> java.util.concurrent.CompletionException:
>> >>
>> org.apache.flink.client.deployment.application.ApplicationExecutionException:
>> >> Could not execute application.
>> >> at
>> >>
>> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>> >> ~[?:1.8.0_161]
>> >> at
>> >>
>> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
>> >> ~[?:1.8.0_161]
>> >> at
>> >>
>> java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943)
>> >> ~[?:1.8.0_161]
>> >> at
>> >>
>> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
>> >> ~[?:1.8.0_161]
>> >> at
>> >>
>> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>> >> ~[?:1.8.0_161]
>> >> at
>> >>
>> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>> >> ~[?:1.8.0_161]
>> >> at
>> >>
>> org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:245)
>> >> ~[flink-clients_2.11-1.11.0.jar:1.11.0]
>> >> at
>> >>
>> org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.lambda$runApplicationAsync$1(ApplicationDispatcherBootstrap.java:199)
>> >> ~[flink-clients_2.11-1.11.0.jar:1.11.0]
>> >> at
>> >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> >> [?:1.8.0_161]
>> >> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> >> [?:1.8.0_161]
>> >> at
>> >>
>> org.apache.flink.runtime.concurrent.akka.ActorSystemScheduledExecutorAdapter$ScheduledFutureTask.run(ActorSystemScheduledExecutorAdapter.java:154)
>> >> [qile-data-flow-1.0.jar:?]
>> >> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>> >> [qile-data-flow-1.0.jar:?]
>> >> at
>> >>
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
>> >> [qile-data-flow-1.0.jar:?]
>> >> at
>> >> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> >> [qile-data-flow-1.0.jar:?]
>> >> at
&

Re:Re: Re: Reply:Re: Re: Table options do not contain an option key 'connector' for discovering a connector.

2020-07-13, post by Zhou Zach
kafka_table was created under the default dialect; when I switch to the HiveCatalog (Hive dialect), neither WATERMARK nor the WITH syntax is supported.
If a table is created under the default dialect, is it only valid in the current session?
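
For a table that really should live only in the current session, the TEMPORARY keyword can be used in the default dialect; a sketch reusing the connector options used elsewhere in this thread (assuming Flink 1.11):

CREATE TEMPORARY TABLE kafka_table (
  uid VARCHAR,
  sex VARCHAR,
  age INT,
  created_time TIMESTAMP(3),
  WATERMARK FOR created_time AS created_time - INTERVAL '3' SECOND
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',
  'connector.topic' = 'user',
  'connector.startup-mode' = 'latest-offset',
  'connector.properties.bootstrap.servers' = 'cdh1:9092,cdh2:9092,cdh3:9092',
  'connector.properties.group.id' = 'user_flink',
  'format.type' = 'json',
  'format.derive-schema' = 'true'
)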

















On 2020-07-13 19:27:44, "Jingsong Li" wrote:
>Hi,
>
>Question 1:
>
>As long as the current catalog is the HiveCatalog.
>In theory the Kafka table is also stored in the Hive metastore; if you do not want the error, you can use CREATE TABLE XXX IF NOT EXISTS.
>
>To clarify, what does "cannot see it" mean? Could you try the Kafka table alone: is it gone after a restart?
>
>Question 2:
>
>A table created with filesystem is a filesystem table; it has nothing to do with the Hive
>metastore. You need to use the syntax for creating filesystem tables [1].
>
>A filesystem table writes its data directly to the file system, in a format compatible with Hive, so if the write path is the path of some Hive table, it can be queried from the Hive side.
>However, its partition commit does not support the metastore, so there is no default implementation that automatically adds
>partitions to Hive; you need a custom partition-commit-policy.
>
>[1]
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/filesystem.html
>
>Best,
>Jingsong
>
>On Mon, Jul 13, 2020 at 6:51 PM Zhou Zach  wrote:
>
>> Embarrassing:
>> I had two projects open and changed the wrong one. Now the data can be queried from Hive successfully. Thanks to the community for the warm replies, @Jingsong Li,  @夏帅
>> I have watched Jingsong's video from the group several times over the last two days; sincere thanks!
>> Two more questions:
>> Question 1:
>> The created kafka_table is visible neither in Hive nor in the Flink
>> SQL client. Also, every rerun of the program fails unless hive_table is dropped first; after dropping hive_table1 it runs. But the program always runs without dropping kafka_table. So is the created kafka_table a temporary table, and is only hive_table stored in the metastore?
>>
>>
>>
>>
>>
>>
>> Question 2:
>> A helpful community member just answered that a Hive table can also be created with the filesystem connector instead of the HiveCatalog. I tried it and got an error:
>> java.util.concurrent.CompletionException:
>> org.apache.flink.client.deployment.application.ApplicationExecutionException:
>> Could not execute application.
>> at
>> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>> ~[?:1.8.0_161]
>> at
>> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
>> ~[?:1.8.0_161]
>> at
>> java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943)
>> ~[?:1.8.0_161]
>> at
>> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
>> ~[?:1.8.0_161]
>> at
>> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>> ~[?:1.8.0_161]
>> at
>> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>> ~[?:1.8.0_161]
>> at
>> org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:245)
>> ~[flink-clients_2.11-1.11.0.jar:1.11.0]
>> at
>> org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.lambda$runApplicationAsync$1(ApplicationDispatcherBootstrap.java:199)
>> ~[flink-clients_2.11-1.11.0.jar:1.11.0]
>> at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> [?:1.8.0_161]
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> [?:1.8.0_161]
>> at
>> org.apache.flink.runtime.concurrent.akka.ActorSystemScheduledExecutorAdapter$ScheduledFutureTask.run(ActorSystemScheduledExecutorAdapter.java:154)
>> [qile-data-flow-1.0.jar:?]
>> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>> [qile-data-flow-1.0.jar:?]
>> at
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
>> [qile-data-flow-1.0.jar:?]
>> at
>> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> [qile-data-flow-1.0.jar:?]
>> at
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> [qile-data-flow-1.0.jar:?]
>> at
>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> [qile-data-flow-1.0.jar:?]
>> at
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>> [qile-data-flow-1.0.jar:?]
>> Caused by:
>> org.apache.flink.client.deployment.application.ApplicationExecutionException:
>> Could not execute application.
>> ... 11 more
>> Caused by: org.apache.flink.client.program.ProgramInvocationException: The
>> main method caused an error: Unable to create a sink for writing table
>> 'default_catalog.default_database.hive_table1'.
>>
>> Table options are:
>>
>> 'connector'='filesystem'
>> 'hive.storage.file-format'='parquet'
>> 'is_generic'='false'
>> 'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00'
>> 'sink.partition-commit.delay'='0s'
>> 'sink.partition-commit.policy.kind'='metastore,success-file'
>> at
>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
>> ~[flink-clients_2.11-1.11.0.jar:1.11.0]
>>

Re:Re: Reply:Re: Re: Table options do not contain an option key 'connector' for discovering a connector.

2020-07-13, post by Zhou Zach
(FromKafkaSinkHiveByFile.scala:68)
 ~[qile-data-flow-1.0.jar:?]
at 
cn.ibobei.qile.dataflow.sql.FromKafkaSinkHiveByFile.main(FromKafkaSinkHiveByFile.scala)
 ~[qile-data-flow-1.0.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_161]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_161]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_161]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_161]
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:288)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149) 
~[flink-clients_2.11-1.11.0.jar:1.11.0]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:230)
 ~[flink-clients_2.11-1.11.0.jar:1.11.0]
... 10 more






query:




val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
streamExecutionEnv.enableCheckpointing(5 * 1000, 
CheckpointingMode.EXACTLY_ONCE)
streamExecutionEnv.getCheckpointConfig.setCheckpointTimeout(10 * 1000)

val blinkEnvSettings = 
EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
blinkEnvSettings)



streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
streamTableEnv.executeSql(
  """
|
|
|CREATE TABLE hive_table (
|  user_id STRING,
|  age INT
|) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet 
TBLPROPERTIES (
|  'connector'='filesystem',
|  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
|  'sink.partition-commit.delay'='0s',
|  'sink.partition-commit.policy.kind'='metastore,success-file'
|)
|
|""".stripMargin)

streamTableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
streamTableEnv.executeSql(
  """
|
|CREATE TABLE kafka_table (
|uid VARCHAR,
|-- uid BIGINT,
|sex VARCHAR,
|age INT,
|created_time TIMESTAMP(3),
|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
|) WITH (
|'connector.type' = 'kafka',
|'connector.version' = 'universal',
| 'connector.topic' = 'user',
|-- 'connector.topic' = 'user_long',
|'connector.startup-mode' = 'latest-offset',
|'connector.properties.zookeeper.connect' = 
'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.properties.bootstrap.servers' = 
'cdh1:9092,cdh2:9092,cdh3:9092',
|'connector.properties.group.id' = 'user_flink',
|'format.type' = 'json',
|'format.derive-schema' = 'true'
|)
|""".stripMargin)


streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)

streamTableEnv.executeSql(
  """
|
|INSERT INTO hive_table
|SELECT uid, age, DATE_FORMAT(created_time, 'yyyy-MM-dd'), 
DATE_FORMAT(created_time, 'HH')
|FROM kafka_table
|
|""".stripMargin)

streamTableEnv.executeSql(
  """
|
|SELECT * FROM hive_table WHERE dt='2020-07-13' and hr='18'
|
|""".stripMargin)
  .print()
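
As the replies further down point out, the Hive-dialect DDL only produces a real Hive table when a HiveCatalog is registered and in use; in default_catalog the planner falls back to looking for a 'connector' option. A minimal sketch of that missing registration step, assuming a catalog named "hive", a hive-conf directory of /etc/hive/conf and Hive version 2.1.1 (all placeholders, not taken from the original job):

import org.apache.flink.table.catalog.hive.HiveCatalog

// register and activate a HiveCatalog before running the Hive-dialect CREATE TABLE,
// so that the table is created as a real Hive table instead of in default_catalog
val hiveCatalog = new HiveCatalog("hive", "default", "/etc/hive/conf", "2.1.1")
streamTableEnv.registerCatalog("hive", hiveCatalog)
streamTableEnv.useCatalog("hive")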













On 2020-07-13 17:52:25, "Shengkai Fang" wrote:
>Could you paste the complete program again?
>
>Best,
>Jingsong
>
>On Mon, Jul 13, 2020 at 5:46 PM Zhou Zach  wrote:
>
>> Hi,
>>
>>
>> I've now changed it to:
>> 'sink.partition-commit.delay'='0s'
>>
>>
>> More than 20 checkpoints have completed and 20-odd files have been produced on HDFS,
>> but querying the Hive table still returns no data
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-07-13 17:23:34, "夏帅" wrote:
>>
>> Hi,
>> You have set a 1-hour
>> SINK_PARTITION_COMMIT_DELAY
>>
>>
>> --
>> From: Zhou Zach
>> Sent: Monday, July 13, 2020 17:09
>> To: user-zh
>> Subject: Re:Re: Re: Table options do not contain an option key 'connector' for
>> discovering a connector.
>>
>>
>> Checkpointing is enabled,
>> val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
>>
>> streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
>> 

Re:Re:Re: Re: Table options do not contain an option key 'connector' for discovering a connector.

2020-07-13 Posted by Zhou Zach
Hi,


I've now changed it to:
'sink.partition-commit.delay'='0s'


More than 20 checkpoints have completed and 20-odd files have been produced on HDFS,
but querying the Hive table still returns no data













On 2020-07-13 17:23:34, "夏帅" wrote:

Hi,
You have set a 1-hour
SINK_PARTITION_COMMIT_DELAY


--
From: Zhou Zach
Sent: Monday, July 13, 2020 17:09
To: user-zh
Subject: Re:Re: Re: Table options do not contain an option key 'connector' for 
discovering a connector.


Checkpointing is enabled,
val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
streamExecutionEnv.enableCheckpointing(5 * 1000, CheckpointingMode.EXACTLY_ONCE)
streamExecutionEnv.getCheckpointConfig.setCheckpointTimeout(10 * 1000)




Interval 5s, timeout 10s; but after waiting more than two minutes, a dozen or so files have been written to HDFS, yet querying Hive still returns no data
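
For reference, the commit behaviour depends on the trigger: with the default 'process-time' trigger a partition is committed at the next successful checkpoint once the delay has elapsed, while 'partition-time' additionally waits for the watermark to pass the partition time plus the delay. A sketch of TBLPROPERTIES that commit as early as possible (the exact combination is an assumption, not quoted from this thread):

'sink.partition-commit.trigger'='process-time',
'sink.partition-commit.delay'='0s',
'sink.partition-commit.policy.kind'='metastore,success-file'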














On 2020-07-13 16:52:16, "Jingsong Li" wrote:
>Checkpointing is enabled, right? What is the delay set to?
>
>The partition is added after the checkpoint completes plus the delay
>
>Best,
>Jingsong
>
>On Mon, Jul 13, 2020 at 4:50 PM Zhou Zach  wrote:
>
>> Hi,
>> Following your hint I added a HiveCatalog, and data is now written to HDFS successfully. But why does querying the Hive table directly through Hue return no data? Do I have to add
>> partitions to the Hive table manually? I have currently set the option
>> 'sink.partition-commit.policy.kind'='metastore'
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> At 2020-07-13 15:01:28, "Jingsong Li"  wrote:
>> >Hi,
>> >
>> >Are you using a HiveCatalog? Hive tables or the Hive dialect must be used together with a HiveCatalog
>> >
>> >Otherwise you can only use the Filesystem connector; if the filesystem connector also fails, please paste the error message
>> >
>> >Best,
>> >Jingsong
>> >
>> >On Mon, Jul 13, 2020 at 2:58 PM Zhou Zach  wrote:
>> >
>> >> What should the connector be set to for a Flink 1.11 Hive sink table? I tried
>> >>
>> WITH('connector'='filesystem','path'='...','format'='parquet','sink.partition-commit.delay'='1
>> >> h','sink.partition-commit.policy.kind'='success-file');
>> >> but it also fails with an error
>> >> query:
>> >> streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |
>> >> |CREATE TABLE hive_table (
>> >> |  user_id STRING,
>> >> |  age INT
>> >> |) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet
>> >> TBLPROPERTIES (
>> >> |  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
>> >> |  'sink.partition-commit.trigger'='partition-time',
>> >> |  'sink.partition-commit.delay'='1 h',
>> >> |  'sink.partition-commit.policy.kind'='metastore,success-file'
>> >> |)
>> >> |
>> >> |""".stripMargin)
>> >>
>> >> streamTableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |CREATE TABLE kafka_table (
>> >> |uid VARCHAR,
>> >> |-- uid BIGINT,
>> >> |sex VARCHAR,
>> >> |age INT,
>> >> |created_time TIMESTAMP(3),
>> >> |WATERMARK FOR created_time as created_time - INTERVAL '3'
>> SECOND
>> >> |) WITH (
>> >> |'connector.type' = 'kafka',
>> >> |'connector.version' = 'universal',
>> >> | 'connector.topic' = 'user',
>> >> |-- 'connector.topic' = 'user_long',
>> >> |'connector.startup-mode' = 'latest-offset',
>> >> |'connector.properties.zookeeper.connect' =
>> >> 'cdh1:2181,cdh2:2181,cdh3:2181',
>> >> |'connector.properties.bootstrap.servers' =
>> >> 'cdh1:9092,cdh2:9092,cdh3:9092',
>> >> |'connector.properties.group.id' = 'user_flink',
>> >> |'format.type' = 'json',
>> >> |'format.derive-schema' = 'true'
>> >> |)
>> >> |""".stripMargin)
>> >>
>> >>
>> >>
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |INSERT INTO hive_table
>> >> |SELECT uid, age, DATE_FORMAT(created_time, 'yyyy-MM-dd'),
>> >> DATE_FORMAT(created_time, 'HH')
>> >> |FROM kafka_table
>> >> |
>> >> |""".stripMargin)
>> >>
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |SELECT * FROM hive_table WHERE d

Re:Re: Re: Table options do not contain an option key 'connector' for discovering a connector.

2020-07-13 Posted by Zhou Zach
Checkpointing is enabled,
val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
streamExecutionEnv.enableCheckpointing(5 * 1000, CheckpointingMode.EXACTLY_ONCE)
streamExecutionEnv.getCheckpointConfig.setCheckpointTimeout(10 * 1000)




Interval 5s, timeout 10s; but after waiting more than two minutes, a dozen or so files have been written to HDFS, yet querying Hive still returns no data














On 2020-07-13 16:52:16, "Jingsong Li" wrote:
>Checkpointing is enabled, right? What is the delay set to?
>
>The partition is added after the checkpoint completes plus the delay
>
>Best,
>Jingsong
>
>On Mon, Jul 13, 2020 at 4:50 PM Zhou Zach  wrote:
>
>> Hi,
>> Following your hint I added a HiveCatalog, and data is now written to HDFS successfully. But why does querying the Hive table directly through Hue return no data? Do I have to add
>> partitions to the Hive table manually? I have currently set the option
>> 'sink.partition-commit.policy.kind'='metastore'
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> At 2020-07-13 15:01:28, "Jingsong Li"  wrote:
>> >Hi,
>> >
>> >Are you using a HiveCatalog? Hive tables or the Hive dialect must be used together with a HiveCatalog
>> >
>> >Otherwise you can only use the Filesystem connector; if the filesystem connector also fails, please paste the error message
>> >
>> >Best,
>> >Jingsong
>> >
>> >On Mon, Jul 13, 2020 at 2:58 PM Zhou Zach  wrote:
>> >
>> >> What should the connector be set to for a Flink 1.11 Hive sink table? I tried
>> >>
>> WITH('connector'='filesystem','path'='...','format'='parquet','sink.partition-commit.delay'='1
>> >> h','sink.partition-commit.policy.kind'='success-file');
>> >> but it also fails with an error
>> >> query:
>> >> streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |
>> >> |CREATE TABLE hive_table (
>> >> |  user_id STRING,
>> >> |  age INT
>> >> |) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet
>> >> TBLPROPERTIES (
>> >> |  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
>> >> |  'sink.partition-commit.trigger'='partition-time',
>> >> |  'sink.partition-commit.delay'='1 h',
>> >> |  'sink.partition-commit.policy.kind'='metastore,success-file'
>> >> |)
>> >> |
>> >> |""".stripMargin)
>> >>
>> >> streamTableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |CREATE TABLE kafka_table (
>> >> |uid VARCHAR,
>> >> |-- uid BIGINT,
>> >> |sex VARCHAR,
>> >> |age INT,
>> >> |created_time TIMESTAMP(3),
>> >> |WATERMARK FOR created_time as created_time - INTERVAL '3'
>> SECOND
>> >> |) WITH (
>> >> |'connector.type' = 'kafka',
>> >> |'connector.version' = 'universal',
>> >> | 'connector.topic' = 'user',
>> >> |-- 'connector.topic' = 'user_long',
>> >> |'connector.startup-mode' = 'latest-offset',
>> >> |'connector.properties.zookeeper.connect' =
>> >> 'cdh1:2181,cdh2:2181,cdh3:2181',
>> >> |'connector.properties.bootstrap.servers' =
>> >> 'cdh1:9092,cdh2:9092,cdh3:9092',
>> >> |'connector.properties.group.id' = 'user_flink',
>> >> |'format.type' = 'json',
>> >> |'format.derive-schema' = 'true'
>> >> |)
>> >> |""".stripMargin)
>> >>
>> >>
>> >>
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |INSERT INTO hive_table
>> >> |SELECT uid, age, DATE_FORMAT(created_time, 'yyyy-MM-dd'),
>> >> DATE_FORMAT(created_time, 'HH')
>> >> |FROM kafka_table
>> >> |
>> >> |""".stripMargin)
>> >>
>> >> streamTableEnv.executeSql(
>> >> """
>> >> |
>> >> |SELECT * FROM hive_table WHERE dt='2020-07-13' and hr='13'
>> >> |
>> >> |""".stripMargin)
>> >> .print()
>> >> Error stack:
>> >> Exception in thread "main"
>> org.apache.flink.table.api.ValidationException:
>> >> Unable to create a sink for writing table
>> >> 'default_catalog.default_database.hive_table'.
>> >>
>> >> Table options are:
>

Re:Re: Table options do not contain an option key 'connector' for discovering a connector.

2020-07-13 Posted by Zhou Zach
Hi,
Following your hint I added a HiveCatalog, and data is now written to HDFS successfully. But why does querying the Hive table directly through Hue return no data? Do I have to add 
partitions to the Hive table manually? I have currently set the option
'sink.partition-commit.policy.kind'='metastore'

















At 2020-07-13 15:01:28, "Jingsong Li"  wrote:
>Hi,
>
>Are you using a HiveCatalog? Hive tables or the Hive dialect must be used together with a HiveCatalog
>
>Otherwise you can only use the Filesystem connector; if the filesystem connector also fails, please paste the error message
>
>Best,
>Jingsong
>
>On Mon, Jul 13, 2020 at 2:58 PM Zhou Zach  wrote:
>
>> What should the connector be set to for a Flink 1.11 Hive sink table? I tried
>> WITH('connector'='filesystem','path'='...','format'='parquet','sink.partition-commit.delay'='1
>> h','sink.partition-commit.policy.kind'='success-file');
>> but it also fails with an error
>> query:
>> streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
>> streamTableEnv.executeSql(
>> """
>> |
>> |
>> |CREATE TABLE hive_table (
>> |  user_id STRING,
>> |  age INT
>> |) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet
>> TBLPROPERTIES (
>> |  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
>> |  'sink.partition-commit.trigger'='partition-time',
>> |  'sink.partition-commit.delay'='1 h',
>> |  'sink.partition-commit.policy.kind'='metastore,success-file'
>> |)
>> |
>> |""".stripMargin)
>>
>> streamTableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
>> streamTableEnv.executeSql(
>> """
>> |
>> |CREATE TABLE kafka_table (
>> |uid VARCHAR,
>> |-- uid BIGINT,
>> |sex VARCHAR,
>> |age INT,
>> |created_time TIMESTAMP(3),
>> |WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
>> |) WITH (
>> |'connector.type' = 'kafka',
>> |'connector.version' = 'universal',
>> | 'connector.topic' = 'user',
>> |-- 'connector.topic' = 'user_long',
>> |'connector.startup-mode' = 'latest-offset',
>> |'connector.properties.zookeeper.connect' =
>> 'cdh1:2181,cdh2:2181,cdh3:2181',
>> |'connector.properties.bootstrap.servers' =
>> 'cdh1:9092,cdh2:9092,cdh3:9092',
>> |'connector.properties.group.id' = 'user_flink',
>> |'format.type' = 'json',
>> |'format.derive-schema' = 'true'
>> |)
>> |""".stripMargin)
>>
>>
>>
>> streamTableEnv.executeSql(
>> """
>> |
>> |INSERT INTO hive_table
>> |SELECT uid, age, DATE_FORMAT(created_time, 'yyyy-MM-dd'),
>> DATE_FORMAT(created_time, 'HH')
>> |FROM kafka_table
>> |
>> |""".stripMargin)
>>
>> streamTableEnv.executeSql(
>> """
>> |
>> |SELECT * FROM hive_table WHERE dt='2020-07-13' and hr='13'
>> |
>> |""".stripMargin)
>> .print()
>> Error stack:
>> Exception in thread "main" org.apache.flink.table.api.ValidationException:
>> Unable to create a sink for writing table
>> 'default_catalog.default_database.hive_table'.
>>
>> Table options are:
>>
>> 'hive.storage.file-format'='parquet'
>> 'is_generic'='false'
>> 'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00'
>> 'sink.partition-commit.delay'='1 h'
>> 'sink.partition-commit.policy.kind'='metastore,success-file'
>> 'sink.partition-commit.trigger'='partition-time'
>> at
>> org.apache.flink.table.factories.FactoryUtil.createTableSink(FactoryUtil.java:164)
>> at
>> org.apache.flink.table.planner.delegation.PlannerBase.getTableSink(PlannerBase.scala:344)
>> at
>> org.apache.flink.table.planner.delegation.PlannerBase.translateToRel(PlannerBase.scala:204)
>> at
>> org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:163)
>> at
>> org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:163)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>> at
>> scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>> at
>> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
&

Table options do not contain an option key 'connector' for discovering a connector.

2020-07-13 Posted by Zhou Zach
What should the connector be set to for a Flink 1.11 Hive sink table? I tried
WITH('connector'='filesystem','path'='...','format'='parquet','sink.partition-commit.delay'='1
 h','sink.partition-commit.policy.kind'='success-file');
but it also fails with an error
query:
streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
streamTableEnv.executeSql(
"""
|
|
|CREATE TABLE hive_table (
|  user_id STRING,
|  age INT
|) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet TBLPROPERTIES (
|  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
|  'sink.partition-commit.trigger'='partition-time',
|  'sink.partition-commit.delay'='1 h',
|  'sink.partition-commit.policy.kind'='metastore,success-file'
|)
|
|""".stripMargin)

streamTableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
streamTableEnv.executeSql(
"""
|
|CREATE TABLE kafka_table (
|uid VARCHAR,
|-- uid BIGINT,
|sex VARCHAR,
|age INT,
|created_time TIMESTAMP(3),
|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
|) WITH (
|'connector.type' = 'kafka',
|'connector.version' = 'universal',
| 'connector.topic' = 'user',
|-- 'connector.topic' = 'user_long',
|'connector.startup-mode' = 'latest-offset',
|'connector.properties.zookeeper.connect' = 
'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.properties.bootstrap.servers' = 
'cdh1:9092,cdh2:9092,cdh3:9092',
|'connector.properties.group.id' = 'user_flink',
|'format.type' = 'json',
|'format.derive-schema' = 'true'
|)
|""".stripMargin)



streamTableEnv.executeSql(
"""
|
|INSERT INTO hive_table
|SELECT uid, age, DATE_FORMAT(created_time, 'yyyy-MM-dd'), 
DATE_FORMAT(created_time, 'HH')
|FROM kafka_table
|
|""".stripMargin)

streamTableEnv.executeSql(
"""
|
|SELECT * FROM hive_table WHERE dt='2020-07-13' and hr='13'
|
|""".stripMargin)
.print()
Error stack:
Exception in thread "main" org.apache.flink.table.api.ValidationException: 
Unable to create a sink for writing table 
'default_catalog.default_database.hive_table'.

Table options are:

'hive.storage.file-format'='parquet'
'is_generic'='false'
'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00'
'sink.partition-commit.delay'='1 h'
'sink.partition-commit.policy.kind'='metastore,success-file'
'sink.partition-commit.trigger'='partition-time'
at 
org.apache.flink.table.factories.FactoryUtil.createTableSink(FactoryUtil.java:164)
at 
org.apache.flink.table.planner.delegation.PlannerBase.getTableSink(PlannerBase.scala:344)
at 
org.apache.flink.table.planner.delegation.PlannerBase.translateToRel(PlannerBase.scala:204)
at 
org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:163)
at 
org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:163)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at 
org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:163)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1248)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:694)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:781)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:684)
at org.rabbit.sql.FromKafkaSinkHive$.main(FromKafkaSinkHive.scala:65)
at org.rabbit.sql.FromKafkaSinkHive.main(FromKafkaSinkHive.scala)
Caused by: org.apache.flink.table.api.ValidationException: Table options do not 
contain an option key 'connector' for discovering a connector.
at 
org.apache.flink.table.factories.FactoryUtil.getDynamicTableFactory(FactoryUtil.java:321)
at 
org.apache.flink.table.factories.FactoryUtil.createTableSink(FactoryUtil.java:157)
... 19 more



Re:how to set table.sql-dialect in flink1.11 StreamTableEnvironment

2020-07-13 Posted by Zhou Zach
Found it:
tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
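
A minimal sketch of how this is typically arranged around the two DDLs (the hiveDdl / kafkaDdl strings below are only placeholders for the CREATE TABLE statements shown elsewhere in this thread). Note that the SET table.sql-dialect=hive statement from the earlier attempt is meant for the SQL CLI and, as far as I know, is not parsed by executeSql in 1.11, which explains the parse error:

import org.apache.flink.table.api.SqlDialect

streamTableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
streamTableEnv.executeSql(hiveDdl)     // CREATE TABLE hive_table ...  (placeholder string)
streamTableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
streamTableEnv.executeSql(kafkaDdl)    // CREATE TABLE kafka_table ... (placeholder string)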

















On 2020-07-13 14:01:45, "Zhou Zach" wrote:
>hi all,
>
>
>I tried it the way below and got an error
>
>
>streamTableEnv.executeSql(
>"""
>|
>|
>|SET table.sql-dialect=hive;
>|CREATE TABLE hive_table (
>|  user_id STRING,
>|  age INT
>|) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet TBLPROPERTIES (
>|  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
>|  'sink.partition-commit.trigger'='partition-time',
>|  'sink.partition-commit.delay'='1 h',
>|  'sink.partition-commit.policy.kind'='metastore,success-file'
>|)
>|
>|""".stripMargin)
>
>
>Error stack:
>Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL 
>parse failed. Encountered "table" at line 4, column 5.
>Was expecting one of:
>(list of expected tokens elided by the archive)
>
>   at 
> org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:56)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:76)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:678)
>


how to set table.sql-dialect in flink1.11 StreamTableEnvironment

2020-07-13 Posted by Zhou Zach
hi all,


I tried it the way below and got an error


streamTableEnv.executeSql(
"""
|
|
|SET table.sql-dialect=hive;
|CREATE TABLE hive_table (
|  user_id STRING,
|  age INT
|) PARTITIONED BY (dt STRING, hr STRING) STORED AS parquet TBLPROPERTIES (
|  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
|  'sink.partition-commit.trigger'='partition-time',
|  'sink.partition-commit.delay'='1 h',
|  'sink.partition-commit.policy.kind'='metastore,success-file'
|)
|
|""".stripMargin)


Error stack:
Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL 
parse failed. Encountered "table" at line 4, column 5.
Was expecting one of:
(list of expected tokens elided by the archive)

at 
org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:56)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:76)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:678)



Re:Re: flink 1.10 upgrade to flink 1.11: submission to YARN fails

2020-07-10 Posted by Zhou Zach
Hello Leonard, the reported error is: Could not find a suitable table factory for 
'org.apache.flink.table.factories.TableSinkFactory' in the classpath.




However, following your hint, I downloaded flink-connector-jdbc_2.11-1.11.0.jar and put it under /opt/flink-1.11.0/lib/, and the job now runs successfully! The first job I ran this morning had a similar cause; downloading the hbase
connector fixed it. Thanks for the help!
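
If the dependency were managed in the build instead of dropped into lib/, the 1.11 coordinates would look roughly like this. This is only an sbt sketch (the original project uses Maven), and scope/version may need adjusting:

// build.sbt sketch (illustrative; mark it "provided" if the jar already sits in Flink's lib/)
libraryDependencies += "org.apache.flink" % "flink-connector-jdbc_2.11" % "1.11.0"
libraryDependencies += "org.apache.flink" % "flink-sql-connector-kafka_2.11" % "1.11.0"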











On 2020-07-10 11:31:39, "Leonard Xu" wrote:
>Hello,Zach
>
>>>> Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException:
>>>> Could not find a suitable table factory for
>>>> 'org.apache.flink.table.factories.TableSourceFactory' in
>>>> the classpath.
>>>> 
>>>> 
>>>> Reason: Required context properties mismatch.
>This error usually means the SQL program is missing a SQL connector or format dependency. The two dependencies below in your pom,
>
>   (dependency XML tags stripped by the archive)
>   org.apache.flink : flink-sql-connector-kafka_2.11 : ${flink.version}
>   org.apache.flink : flink-connector-kafka_2.11 : ${flink.version}
>
>will conflict when put together: flink-sql-connector-kafka_2.11 shades the Kafka dependencies, 
>while flink-connector-kafka_2.11 does not.
>Pick according to your needs: use the first one for SQL programs and the second one for DataStream jobs.
>
>Best,
>Leonard Xu
>
>
>> On Jul 10, 2020, at 11:08, Shuiqiang Chen wrote:
>> 
>> Hi,
>> It looks like the Kafka table source was not created successfully; you may need to put
>> 
>>org.apache.flink
>>flink-sql-connector-kafka_2.11
>>${flink.version}
>> 
>> 
>> this jar under the FLINK_HOME/lib directory
>> 
>> Congxian Qiu wrote on Fri, Jul 10, 2020 at 10:57 AM:
>> 
>>> Hi
>>> 
>>> Judging from the exception, some jar is probably missing; it looks similar to [1], so you may need to check which required jar has not been included.
>>> 
>>> PS: the stack trace points to something CSV related, so check the csv-related jars first
>>> 
>>> ```
>>> The following factories have been considered:
>>> org.apache.flink.table.sources.CsvBatchTableSourceFactory
>>> org.apache.flink.table.sources.CsvAppendTableSourceFactory
>>> org.apache.flink.table.filesystem.FileSystemTableFactory
>>> at
>>> 
>>> org.apache.flink.table.factories.TableFactoryService.filterByContext(TableFactoryService.java:322)
>>> at
>>> 
>>> org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:190)
>>> at
>>> 
>>> org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:143)
>>> at
>>> 
>>> org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:96)
>>> at
>>> 
>>> org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:46)
>>> ... 37 more
>>> ```
>>> 
>>> [1] http://apache-flink.147419.n8.nabble.com/flink-1-11-td4471.html
>>> Best,
>>> Congxian
>>> 
>>> 
>>> Zhou Zach wrote on Fri, Jul 10, 2020 at 10:39 AM:
>>> 
>>>> The log is already complete; this is the full log from the YARN UI, and the yarn logs command shows the same. It is too brief to tell where the error is...
>>>> 
>>>> 
>>>> I also submitted another job that previously ran on flink 1.10; running it with flink 1.11 now throws an exception:
>>>> 
>>>> 
>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>> SLF4J: Found binding in
>>>> 
>>> [jar:file:/opt/flink-1.11.0/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: Found binding in
>>>> 
>>> [jar:file:/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>>>> explanation.
>>>> SLF4J: Actual binding is of type
>>>> [org.apache.logging.slf4j.Log4jLoggerFactory]
>>>> 
>>>> 
>>>> 
>>>> The program finished with the following exception:
>>>> 
>>>> 
>>>> org.apache.flink.client.program.ProgramInvocationException: The main
>>>> method caused an error: findAndCreateTableSource failed.
>>>> at
>>>> 
>>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
>>>> at
>>>> 
>>> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
>>>> at
>>> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149)
>>>> at
>>>> 
>>> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:699)
>>>> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.

Re:Re: flink 1.10 upgrade to flink 1.11: submission to YARN fails

2020-07-09 Posted by Zhou Zach




(pom dependencies, XML tags stripped by the archive)
com.jayway.jsonpath : json-path : 2.4.0
org.apache.flink : flink-connector-jdbc_2.11 : ${flink.version}
mysql : mysql-connector-java : 5.1.46
io.vertx : vertx-core : 3.9.1
io.vertx : vertx-jdbc-client : 3.9.1









On the cluster nodes, flink-1.11.0/lib/ contains:
-rw-r--r-- 1 root root197597 6月  30 10:28 flink-clients_2.11-1.11.0.jar
-rw-r--r-- 1 root root 90782 6月  30 17:46 flink-csv-1.11.0.jar
-rw-r--r-- 1 root root 108349203 6月  30 17:52 flink-dist_2.11-1.11.0.jar
-rw-r--r-- 1 root root 94863 6月  30 17:45 flink-json-1.11.0.jar
-rw-r--r-- 1 root root   7712156 6月  18 10:42 flink-shaded-zookeeper-3.4.14.jar
-rw-r--r-- 1 root root  33325754 6月  30 17:50 flink-table_2.11-1.11.0.jar
-rw-r--r-- 1 root root 47333 6月  30 10:38 
flink-table-api-scala-bridge_2.11-1.11.0.jar
-rw-r--r-- 1 root root  37330521 6月  30 17:50 flink-table-blink_2.11-1.11.0.jar
-rw-r--r-- 1 root root754983 6月  30 12:29 flink-table-common-1.11.0.jar
-rw-r--r-- 1 root root 67114 4月  20 20:47 log4j-1.2-api-2.12.1.jar
-rw-r--r-- 1 root root276771 4月  20 20:47 log4j-api-2.12.1.jar
-rw-r--r-- 1 root root   1674433 4月  20 20:47 log4j-core-2.12.1.jar
-rw-r--r-- 1 root root 23518 4月  20 20:47 log4j-slf4j-impl-2.12.1.jar


I downloaded all the table-related jars, but I still get the same error. Very strange...

















On 2020-07-10 10:24:02, "Congxian Qiu" wrote:
>Hi
>
>This looks like it was submitted to YARN; you need to check the JM log for the actual cause. Also, is the log complete? Here I only see the local log, plus a small portion of the
>jobmanager.err log.
>
>Best,
>Congxian
>
>
>Zhou Zach wrote on Thu, Jul 9, 2020 at 9:23 PM:
>
>> hi all,
>> a job that could be submitted with 1.10 in per-job mode now fails to submit with 1.11 in application mode, and the logs do not make the cause clear,
>> yarn log:
>> Log Type: jobmanager.err
>>
>>
>> Log Upload Time: Thu Jul 09 21:02:48 +0800 2020
>>
>>
>> Log Length: 785
>>
>>
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/yarn/nm/usercache/hdfs/appcache/application_1594271580406_0010/filecache/11/data-flow-1.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> log4j:WARN No appenders could be found for logger
>> (org.apache.flink.runtime.entrypoint.ClusterEntrypoint).
>> log4j:WARN Please initialize the log4j system properly.
>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
>> more info.
>>
>>
>> Log Type: jobmanager.out
>>
>>
>> Log Upload Time: Thu Jul 09 21:02:48 +0800 2020
>>
>>
>> Log Length: 0
>>
>>
>>
>>
>> Log Type: prelaunch.err
>>
>>
>> Log Upload Time: Thu Jul 09 21:02:48 +0800 2020
>>
>>
>> Log Length: 0
>>
>>
>>
>>
>> Log Type: prelaunch.out
>>
>>
>> Log Upload Time: Thu Jul 09 21:02:48 +0800 2020
>>
>>
>> Log Length: 70
>>
>>
>> Setting up env variables
>> Setting up job resources
>> Launching container
>>
>>
>>
>>
>>
>>
>>
>>
>> Local log:
>> 2020-07-09 21:02:41,015 INFO  org.apache.flink.client.cli.CliFrontend
>> [] -
>> 
>> 2020-07-09 21:02:41,020 INFO
>> org.apache.flink.configuration.GlobalConfiguration   [] - Loading
>> configuration property: jobmanager.rpc.address, localhost
>> 2020-07-09 21:02:41,020 INFO
>> org.apache.flink.configuration.GlobalConfiguration   [] - Loading
>> configuration property: jobmanager.rpc.port, 6123
>> 2020-07-09 21:02:41,021 INFO
>> org.apache.flink.configuration.GlobalConfiguration   [] - Loading
>> configuration property: jobmanager.memory.process.size, 1600m
>> 2020-07-09 21:02:41,021 INFO
>> org.apache.flink.configuration.GlobalConfiguration   [] - Loading
>> configuration property: taskmanager.memory.process.size, 1728m
>> 2020-07-09 21:02:41,021 INFO
>> org.apache.flink.configuration.GlobalConfiguration   [] - Loading
>> configuration property: taskmanager.numberOfTaskSlots, 1
>> 2020-07-09 21:02:41,021 INFO
>> org.apache.flink.configuration.GlobalConfiguration   [] - Loading
>> configuration property: 

flink 1.10 upgrade to flink 1.11: submission to YARN fails

2020-07-09 Posted by Zhou Zach
hi all,
a job that could be submitted with 1.10 in per-job mode now fails to submit with 1.11 in application mode, and the logs do not make the cause clear,
yarn log:
Log Type: jobmanager.err


Log Upload Time: Thu Jul 09 21:02:48 +0800 2020


Log Length: 785


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/yarn/nm/usercache/hdfs/appcache/application_1594271580406_0010/filecache/11/data-flow-1.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger 
(org.apache.flink.runtime.entrypoint.ClusterEntrypoint).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.


Log Type: jobmanager.out


Log Upload Time: Thu Jul 09 21:02:48 +0800 2020


Log Length: 0




Log Type: prelaunch.err


Log Upload Time: Thu Jul 09 21:02:48 +0800 2020


Log Length: 0




Log Type: prelaunch.out


Log Upload Time: Thu Jul 09 21:02:48 +0800 2020


Log Length: 70


Setting up env variables
Setting up job resources
Launching container








Local log:
2020-07-09 21:02:41,015 INFO  org.apache.flink.client.cli.CliFrontend   
   [] - 

2020-07-09 21:02:41,020 INFO  
org.apache.flink.configuration.GlobalConfiguration   [] - Loading 
configuration property: jobmanager.rpc.address, localhost
2020-07-09 21:02:41,020 INFO  
org.apache.flink.configuration.GlobalConfiguration   [] - Loading 
configuration property: jobmanager.rpc.port, 6123
2020-07-09 21:02:41,021 INFO  
org.apache.flink.configuration.GlobalConfiguration   [] - Loading 
configuration property: jobmanager.memory.process.size, 1600m
2020-07-09 21:02:41,021 INFO  
org.apache.flink.configuration.GlobalConfiguration   [] - Loading 
configuration property: taskmanager.memory.process.size, 1728m
2020-07-09 21:02:41,021 INFO  
org.apache.flink.configuration.GlobalConfiguration   [] - Loading 
configuration property: taskmanager.numberOfTaskSlots, 1
2020-07-09 21:02:41,021 INFO  
org.apache.flink.configuration.GlobalConfiguration   [] - Loading 
configuration property: parallelism.default, 1
2020-07-09 21:02:41,021 INFO  
org.apache.flink.configuration.GlobalConfiguration   [] - Loading 
configuration property: jobmanager.execution.failover-strategy, region
2020-07-09 21:02:41,164 INFO  
org.apache.flink.runtime.security.modules.HadoopModule   [] - Hadoop user 
set to hdfs (auth:SIMPLE)
2020-07-09 21:02:41,172 INFO  
org.apache.flink.runtime.security.modules.JaasModule [] - Jaas file 
will be created as /tmp/jaas-2213111423022415421.conf.
2020-07-09 21:02:41,181 INFO  org.apache.flink.client.cli.CliFrontend   
   [] - Running 'run-application' command.
2020-07-09 21:02:41,194 INFO  
org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer 
[] - Submitting application in 'Application Mode'.
2020-07-09 21:02:41,201 WARN  
org.apache.flink.yarn.configuration.YarnLogConfigUtil[] - The 
configuration directory ('/opt/flink-1.11.0/conf') already contains a LOG4J 
config file.If you want to use logback, then please delete or rename the log 
configuration file.
2020-07-09 21:02:41,537 INFO  org.apache.flink.yarn.YarnClusterDescriptor   
   [] - No path for the flink jar passed. Using the location of class 
org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-07-09 21:02:41,665 INFO  
org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider [] - Failing 
over to rm220
2020-07-09 21:02:41,717 INFO  org.apache.hadoop.conf.Configuration  
   [] - resource-types.xml not found
2020-07-09 21:02:41,718 INFO  
org.apache.hadoop.yarn.util.resource.ResourceUtils   [] - Unable to 
find 'resource-types.xml'.
2020-07-09 21:02:41,755 INFO  org.apache.flink.yarn.YarnClusterDescriptor   
   [] - Cluster specification: 
ClusterSpecification{masterMemoryMB=2048, taskManagerMemoryMB=4096, 
slotsPerTaskManager=1}
2020-07-09 21:02:42,723 INFO  org.apache.flink.yarn.YarnClusterDescriptor   
   [] - Submitting application master application_1594271580406_0010
2020-07-09 21:02:42,969 INFO  
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl[] - Submitted 
application application_1594271580406_0010
2020-07-09 21:02:42,969 INFO  org.apache.flink.yarn.YarnClusterDescriptor   
   [] - Waiting for the cluster to be allocated
2020-07-09 21:02:42,971 INFO  org.apache.flink.yarn.YarnClusterDescriptor   
   [] - Deploying cluster, current state ACCEPTED
2020-07-09 21:02:47,619 INFO  org.apache.flink.yarn.YarnClusterDescriptor 

Re:Re: Re:flink Sql 1.11 executeSql reports No operators defined in streaming topology

2020-07-08 Posted by Zhou Zach
Removing it fixed the problem, thanks for the explanation
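
A minimal sketch of the resulting 1.11 structure (the DDL strings stand for the CREATE TABLE statements shown further down in this thread; they are placeholders here): executeSql on the INSERT submits and runs the job by itself, so no env.execute() call follows.

// Flink 1.11: executeSql submits the INSERT job asynchronously; no env.execute() afterwards
streamTableEnv.executeSql(kafkaSourceDdl)   // CREATE TABLE `user` ...     (placeholder string)
streamTableEnv.executeSql(hbaseSinkDdl)     // CREATE TABLE user_hbase3 ... (placeholder string)
streamTableEnv.executeSql("INSERT INTO user_hbase3 SELECT ...")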

















On 2020-07-08 16:07:17, "Jingsong Li" wrote:
>Hi,
>
>In your code, streamTableEnv.executeSql means the job has already been submitted to the cluster to run asynchronously.
>
>So your later call "streamExecutionEnv.execute("from kafka sink hbase")"
>has no real physical operators behind it. You don't need to call it any more.
>
>Best,
>Jingsong
>
>On Wed, Jul 8, 2020 at 3:56 PM Zhou Zach  wrote:
>
>>
>>
>>
>> I changed the code structure to this:
>>
>>
>>
>>
>> val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
>>
>> val blinkEnvSettings =
>> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
>>
>> val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv,
>> blinkEnvSettings)
>>
>>
>>
>>
>>
>> streamExecutionEnv.execute("from kafka sink hbase")
>>
>>
>>
>>
>> It still reports the same error
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-07-08 15:40:41, "夏帅" wrote:
>> >Hi,
>> >Check whether your code is structured like the following
>> >val bsEnv = StreamExecutionEnvironment.getExecutionEnvironment
>> >val bsSettings =
>> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build
>> >val tableEnv = StreamTableEnvironment.create(bsEnv, bsSettings)
>> >  ..
>> >tableEnv.execute("")
>> >If so, try using bsEnv.execute("")
>> >In 1.11 the execute implementations of the two have changed
>> >
>> >
>> >--
>> >From: Zhou Zach
>> >Sent: Wednesday, July 8, 2020 15:30
>> >To: Flink user-zh mailing list
>> >Subject: flink Sql 1.11 executeSql reports No operators defined in streaming topology
>> >
>> >The code runs fine on flink
>> 1.10.1. After upgrading to 1.11.0, streamTableEnv.sqlUpdate was flagged as deprecated, so I switched to executeSql; about 2 seconds after the program starts it throws:
>> >Exception in thread "main" java.lang.IllegalStateException: No operators
>> defined in streaming topology. Cannot generate StreamGraph.
>> >at
>> org.apache.flink.table.planner.utils.ExecutorUtils.generateStreamGraph(ExecutorUtils.java:47)
>> >at
>> org.apache.flink.table.planner.delegation.StreamExecutor.createPipeline(StreamExecutor.java:47)
>> >at
>> org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:1197)
>> >at org.rabbit.sql.FromKafkaSinkHbase$.main(FromKafkaSinkHbase.scala:79)
>> >at org.rabbit.sql.FromKafkaSinkHbase.main(FromKafkaSinkHbase.scala)
>> >
>> >
>> >However, the data is sinked to HBase correctly. Is executeSql raising a false alarm?...
>> >
>> >
>> >
>> >
>> >query:
>> >streamTableEnv.executeSql(
>> >  """
>> >|
>> >|CREATE TABLE `user` (
>> >|uid BIGINT,
>> >|sex VARCHAR,
>> >|age INT,
>> >|created_time TIMESTAMP(3),
>> >|WATERMARK FOR created_time as created_time - INTERVAL '3'
>> SECOND
>> >|) WITH (
>> >|'connector.type' = 'kafka',
>> >|'connector.version' = 'universal',
>> >|-- 'connector.topic' = 'user',
>> >|'connector.topic' = 'user_long',
>> >|'connector.startup-mode' = 'latest-offset',
>> >|'connector.properties.group.id' = 'user_flink',
>> >|'format.type' = 'json',
>> >|'format.derive-schema' = 'true'
>> >|)
>> >|""".stripMargin)
>> >
>> >
>> >
>> >
>> >
>> >
>> >streamTableEnv.executeSql(
>> >  """
>> >|
>> >|CREATE TABLE user_hbase3(
>> >|rowkey BIGINT,
>> >|cf ROW(sex VARCHAR, age INT, created_time VARCHAR)
>> >|) WITH (
>> >|'connector.type' = 'hbase',
>> >|'connector.version' = '2.1.0',
>> >|'connector.table-name' = 'user_hbase2',
>> >|'connector.zookeeper.znode.parent' = '/hbase',
>> >|'connector.write.buffer-flush.max-size' = '10mb',
>> >|'connector.write.buffer-flush.max-rows' = '1000',
>> >|'connector.write.buffer-flush.interval' = '2s'
>> >|)
>> >|""".stripMargin)
>> >
>> >
>> >streamTableEnv.executeSql(
>> >  """
>> >|
>> >|insert into user_hbase3
>> >|SELECT uid,
>> >|
>> >|  ROW(sex, age, created_time ) as cf
>> >|  FROM  (select uid,sex,age, cast(created_time as VARCHAR) as
>> created_time from `user`)
>> >|
>> >|""".stripMargin)
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>
>
>-- 
>Best, Jingsong Lee


Re:Re:flink Sql 1.11 executeSql reports No operators defined in streaming topology

2020-07-08 Posted by Zhou Zach



I changed the code structure to this:




val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment

val blinkEnvSettings = 
EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()

val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
blinkEnvSettings)





streamExecutionEnv.execute("from kafka sink hbase")




It still reports the same error











On 2020-07-08 15:40:41, "夏帅" wrote:
>Hi,
>Check whether your code is structured like the following
>val bsEnv = StreamExecutionEnvironment.getExecutionEnvironment
>val bsSettings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build
>val tableEnv = StreamTableEnvironment.create(bsEnv, bsSettings)
>  ..
>tableEnv.execute("")
>If so, try using bsEnv.execute("")
>In 1.11 the execute implementations of the two have changed
>
>
>--
>From: Zhou Zach
>Sent: Wednesday, July 8, 2020 15:30
>To: Flink user-zh mailing list
>Subject: flink Sql 1.11 executeSql reports No operators defined in streaming topology
>
>The code runs fine on flink
>1.10.1. After upgrading to 1.11.0, streamTableEnv.sqlUpdate was flagged as deprecated, so I switched to executeSql; about 2 seconds after the program starts it throws:
>Exception in thread "main" java.lang.IllegalStateException: No operators 
>defined in streaming topology. Cannot generate StreamGraph.
>at 
>org.apache.flink.table.planner.utils.ExecutorUtils.generateStreamGraph(ExecutorUtils.java:47)
>at 
>org.apache.flink.table.planner.delegation.StreamExecutor.createPipeline(StreamExecutor.java:47)
>at 
>org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:1197)
>at org.rabbit.sql.FromKafkaSinkHbase$.main(FromKafkaSinkHbase.scala:79)
>at org.rabbit.sql.FromKafkaSinkHbase.main(FromKafkaSinkHbase.scala)
>
>
>However, the data is sinked to HBase correctly. Is executeSql raising a false alarm?...
>
>
>
>
>query:
>streamTableEnv.executeSql(
>  """
>|
>|CREATE TABLE `user` (
>|uid BIGINT,
>|sex VARCHAR,
>|age INT,
>|created_time TIMESTAMP(3),
>|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
>|) WITH (
>|'connector.type' = 'kafka',
>|'connector.version' = 'universal',
>|-- 'connector.topic' = 'user',
>|'connector.topic' = 'user_long',
>|'connector.startup-mode' = 'latest-offset',
>|'connector.properties.group.id' = 'user_flink',
>|'format.type' = 'json',
>|'format.derive-schema' = 'true'
>|)
>|""".stripMargin)
>
>
>
>
>
>
>streamTableEnv.executeSql(
>  """
>|
>|CREATE TABLE user_hbase3(
>|rowkey BIGINT,
>|cf ROW(sex VARCHAR, age INT, created_time VARCHAR)
>|) WITH (
>|'connector.type' = 'hbase',
>|'connector.version' = '2.1.0',
>|'connector.table-name' = 'user_hbase2',
>|'connector.zookeeper.znode.parent' = '/hbase',
>|'connector.write.buffer-flush.max-size' = '10mb',
>|'connector.write.buffer-flush.max-rows' = '1000',
>|'connector.write.buffer-flush.interval' = '2s'
>|)
>|""".stripMargin)
>
>
>streamTableEnv.executeSql(
>  """
>|
>|insert into user_hbase3
>|SELECT uid,
>|
>|  ROW(sex, age, created_time ) as cf
>|  FROM  (select uid,sex,age, cast(created_time as VARCHAR) as 
> created_time from `user`)
>|
>|""".stripMargin)
>
>
>
>
>
>
>
>


flink Sql 1.11 executeSql reports No operators defined in streaming topology

2020-07-08 Posted by Zhou Zach
The code runs fine on flink 
1.10.1. After upgrading to 1.11.0, streamTableEnv.sqlUpdate was flagged as deprecated, so I switched to executeSql; about 2 seconds after the program starts it throws:
Exception in thread "main" java.lang.IllegalStateException: No operators 
defined in streaming topology. Cannot generate StreamGraph.
at 
org.apache.flink.table.planner.utils.ExecutorUtils.generateStreamGraph(ExecutorUtils.java:47)
at 
org.apache.flink.table.planner.delegation.StreamExecutor.createPipeline(StreamExecutor.java:47)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:1197)
at org.rabbit.sql.FromKafkaSinkHbase$.main(FromKafkaSinkHbase.scala:79)
at org.rabbit.sql.FromKafkaSinkHbase.main(FromKafkaSinkHbase.scala)


However, the data is sinked to HBase correctly. Is executeSql raising a false alarm?...




query:
streamTableEnv.executeSql(
  """
|
|CREATE TABLE `user` (
|uid BIGINT,
|sex VARCHAR,
|age INT,
|created_time TIMESTAMP(3),
|WATERMARK FOR created_time as created_time - INTERVAL '3' SECOND
|) WITH (
|'connector.type' = 'kafka',
|'connector.version' = 'universal',
|-- 'connector.topic' = 'user',
|'connector.topic' = 'user_long',
|'connector.startup-mode' = 'latest-offset',
|'connector.properties.group.id' = 'user_flink',
|'format.type' = 'json',
|'format.derive-schema' = 'true'
|)
|""".stripMargin)






streamTableEnv.executeSql(
  """
|
|CREATE TABLE user_hbase3(
|rowkey BIGINT,
|cf ROW(sex VARCHAR, age INT, created_time VARCHAR)
|) WITH (
|'connector.type' = 'hbase',
|'connector.version' = '2.1.0',
|'connector.table-name' = 'user_hbase2',
|'connector.zookeeper.znode.parent' = '/hbase',
|'connector.write.buffer-flush.max-size' = '10mb',
|'connector.write.buffer-flush.max-rows' = '1000',
|'connector.write.buffer-flush.interval' = '2s'
|)
|""".stripMargin)


streamTableEnv.executeSql(
  """
|
|insert into user_hbase3
|SELECT uid,
|
|  ROW(sex, age, created_time ) as cf
|  FROM  (select uid,sex,age, cast(created_time as VARCHAR) as 
created_time from `user`)
|
|""".stripMargin)









Re:Re: flink 1.11 connector jdbc dependency resolution fails

2020-07-08 Posted by Zhou Zach
Thanks for the reminder,




I searched on https://mvnrepository.com/ and could not find the artifact there, but after changing the module name to flink-connector-jdbc it works. Thanks for the reminder
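
With the renamed module on the classpath, a 1.11-style JDBC sink DDL would look roughly like the sketch below; the table name, URL and credentials are placeholders, not values taken from this thread:

CREATE TABLE mysql_sink_table (
  uid BIGINT,
  sex STRING,
  age INT,
  created_time TIMESTAMP(3)
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/test',
  'table-name' = 'user_sink',
  'username' = 'root',
  'password' = '***'
)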








On 2020-07-08 09:35:10, "Leonard Xu" wrote:
>Hello,
>
>I checked: the Maven repository does have it [1], and the official docs also provide a download link [2]. Is the dependency in your pom written incorrectly? In 1.11 the jdbc connector module name was normalized from 
>flink-jdbc to flink-connector-jdbc.
>
>Best,
>Leonard Xu
>
>[1] 
>https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc_2.11/1.11.0/
> 
><https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc_2.11/1.11.0/>
>[2] 
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/jdbc.html
> 
><https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/jdbc.html>
>
>
>> On Jul 8, 2020, at 08:15, Zhou Zach wrote:
>> 
>> hi all,
>> after upgrading flink to 1.11, flink-connector-jdbc 
>> fails to resolve in IDEA, and I could not find it in the Maven repository either. Do I have to build the 1.11 sources manually to install the dependency?
>> 
>


flink 1.11 connector jdbc dependency resolution fails

2020-07-07 Posted by Zhou Zach
hi all,
after upgrading flink to 1.11, flink-connector-jdbc 
fails to resolve in IDEA, and I could not find it in the Maven repository either. Do I have to build the 1.11 sources manually to install the dependency?



flink cep result DataStream no data print

2020-07-05 Posted by Zhou Zach
code:


val inpurtDS = streamTableEnv.toAppendStream[BehaviorInfo](behaviorTable)
inpurtDS.print()
val pattern = Pattern.begin[BehaviorInfo]("start")
  .where(_.clickCount > 7)
val patternStream = CEP.pattern(inpurtDS, pattern)
val result: DataStream[BehaviorInfo] = patternStream.process(
  new PatternProcessFunction[BehaviorInfo, BehaviorInfo]() {
    override def processMatch(
        matchPattern: util.Map[String, util.List[BehaviorInfo]],
        ctx: PatternProcessFunction.Context,
        out: Collector[BehaviorInfo]): Unit = {
      try {
        println(
          s"""
             |matchPattern: $matchPattern
             |util.List[BehaviorInfo]: ${matchPattern.get("start")}
             |""".stripMargin)
        out.collect(matchPattern.get("start").get(0))
      } catch {
        case exception: Exception =>
          println(exception)
      }
    }
  })
result.print()



inpurtDS.print() does produce output, but after applying the pattern, result.print() prints nothing and processMatch in the PatternProcessFunction is never invoked.


Thanks a lot!
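
One common cause of an empty CEP result is event time: if the job runs with TimeCharacteristic.EventTime but the input stream has no timestamps/watermarks assigned, the CEP operator keeps buffering events and never emits matches. A minimal sketch, assuming BehaviorInfo carries an event-time field named ts (a hypothetical name):

// assign event-time timestamps (and with them ascending watermarks) before building the pattern stream;
// alternatively, run the job with TimeCharacteristic.ProcessingTime if event time is not needed
val withTimestamps = inpurtDS.assignAscendingTimestamps(_.ts)
val patternStreamWithTime = CEP.pattern(withTimestamps, pattern)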

Re:Re: Re:Re:Re: Re: Re: flink run from checkpoit failed

2020-06-22 Posted by Zhou Zach
https://issues.apache.org/jira/browse/FLINK-10636
This issue says the problem is specific to Kafka 0.8. I'm using Kafka 2.2.1+cdh6.3.2; does this Kafka version have the problem as well?

















On 2020-06-22 15:16:14, "Congxian Qiu" wrote:
>1. First, the argument after -s can be either a savepoint or a checkpoint path; resuming from a retained checkpoint
>is started exactly this way [1]
>2. From the log you posted there are some authentication-related problems: `2020-06-22 13:00:59,368 ERROR
>org.apache.flink.shaded.curator.org.apache.curator.ConnectionState  -
>Authentication failed`. Perhaps try to resolve that first and see.
>
>[1]
>https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint
>Best,
>Congxian
>
>
>Zhou Zach wrote on Mon, Jun 22, 2020 at 3:03 PM:
>
>> Can the argument after flink run -s only be a savepointPath, not the path of a checkpoint taken automatically by the flink job?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-06-22 14:32:02, "Zhou Zach" wrote:
>> >I restarted the CDH6 cluster and still get the same error. Flink failure recovery does not work, so I dare not go to production. Could someone please take a look?
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On 2020-06-22 13:21:01, "Zhou Zach" wrote:
>> >
>> >After killing the YARN application with yarn application kill,
>> >running /opt/flink-1.10.0/bin/flink run -s
>> hdfs://nameservice1:8020/user/flink10/checkpoints/f1b6f5392cd5053db155e709ffe9f871/chk-15/_metadata
>> dataflow.sql.FromKafkaSinkJdbcForCountPerSecond
>> /data/warehouse/streaming/data-flow-1.0.jar does not start; the logs under /opt/flink-1.10.0/log were attached...
>> >
>> >
>> >Running /opt/flink-1.10.0/bin/flink run -c
>> dataflow.sql.FromKafkaSinkJdbcForCountPerSecond -m yarn-cluster -yjm 1024m
>> -ytm 8192m -p 2 -ys 4 -ynm UV -d data-flow-1.0.jar starts fine; it only fails when the -s argument is added...
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On 2020-06-21 09:16:45, "Congxian Qiu" wrote:
>> >>Hi
>> >>
>> >>Did the application for this job come up? If it did, you can check the JM
>> >>log; if not, check the submitting client side for a more detailed submission log. The log directory defaults to `/opt/flink-1.10.0/log`
>> >>
>> >>Best,
>> >>Congxian
>> >>
>> >>
>> >>Zhou Zach wrote on Fri, Jun 19, 2020 at 8:15 PM:
>> >>
>> >>> I'm using per-job mode, not yarn-session mode
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> At 2020-06-19 20:06:47, "Rui Li"  wrote:
>> >>> >Then you need to restart the yarn session and submit the job again
>> >>> >
>> >>> >On Fri, Jun 19, 2020 at 6:22 PM Zhou Zach  wrote:
>> >>> >
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> I killed the YARN application with yarn application kill;
>> after killing it, YARN did not restart the flink
>> >>> job
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On 2020-06-19 17:54:45, "Rui Li" wrote:
>> >>> >> >By "yarn application kill flink job" do you mean the YARN application was killed? Did it get restarted afterwards?
>> >>> >> >
>> >>> >> >On Fri, Jun 19, 2020 at 4:09 PM Zhou Zach 
>> wrote:
>> >>> >> >
>> >>> >> >>
>> >>> >> >>
>> >> >> I added the two timeout settings below in flink-1.10.0/conf/flink-conf.yaml, but they have no effect
>> >>> >> >> akka.client.timeout: 6
>> >>> >> >> akka.ask.timeout: 600
>> >>> >> >>
>> >> >> Does anyone know what the cause is?
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>>

Re:Re:Re:Re: Re: Re: flink run from checkpoit failed

2020-06-22 Posted by Zhou Zach
Can the argument after flink run -s only be a savepointPath, not the path of a checkpoint taken automatically by the flink job?
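
As the reply quoted below notes, -s also accepts a retained checkpoint's _metadata path. For that to be usable after a cancel or kill, the job has to retain externalized checkpoints; a minimal sketch of the 1.10 setting (a sketch only, not taken from the original job):

import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup

// keep checkpoints around when the job is cancelled, so `flink run -s <chk-path>/_metadata` can resume from them
streamExecutionEnv.getCheckpointConfig.enableExternalizedCheckpoints(
  ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION)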















On 2020-06-22 14:32:02, "Zhou Zach" wrote:
>I restarted the CDH6 cluster and still get the same error. Flink failure recovery does not work, so I dare not go to production. Could someone please take a look?
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 2020-06-22 13:21:01, "Zhou Zach" wrote:
>
>After killing the YARN application with yarn application kill,
>running /opt/flink-1.10.0/bin/flink run -s 
>hdfs://nameservice1:8020/user/flink10/checkpoints/f1b6f5392cd5053db155e709ffe9f871/chk-15/_metadata
>  dataflow.sql.FromKafkaSinkJdbcForCountPerSecond 
>/data/warehouse/streaming/data-flow-1.0.jar does not start; the logs under /opt/flink-1.10.0/log were attached...
>
>
>Running /opt/flink-1.10.0/bin/flink run -c 
>dataflow.sql.FromKafkaSinkJdbcForCountPerSecond -m yarn-cluster -yjm 1024m 
>-ytm 8192m -p 2 -ys 4 -ynm UV -d data-flow-1.0.jar starts fine; it only fails when the -s argument is added...
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 2020-06-21 09:16:45, "Congxian Qiu" wrote:
>>Hi
>>
>>Did the application for this job come up? If it did, you can check the JM
>>log; if not, check the submitting client side for a more detailed submission log. The log directory defaults to `/opt/flink-1.10.0/log`
>>
>>Best,
>>Congxian
>>
>>
>>Zhou Zach wrote on Fri, Jun 19, 2020 at 8:15 PM:
>>
>>> I'm using per-job mode, not yarn-session mode
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> At 2020-06-19 20:06:47, "Rui Li"  wrote:
>>> >Then you need to restart the yarn session and submit the job again
>>> >
>>> >On Fri, Jun 19, 2020 at 6:22 PM Zhou Zach  wrote:
>>> >
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> I killed the YARN application with yarn application kill; after that YARN did not restart the flink
>>> job
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On 2020-06-19 17:54:45, "Rui Li" wrote:
>>> >> >By "yarn application kill flink job" do you mean the YARN application was killed? Did it get restarted afterwards?
>>> >> >
>>> >> >On Fri, Jun 19, 2020 at 4:09 PM Zhou Zach  wrote:
>>> >> >
>>> >> >>
>>> >> >>
>>> >> >> I added the two timeout settings below in flink-1.10.0/conf/flink-conf.yaml, but they have no effect
>>> >> >> akka.client.timeout: 6
>>> >> >> akka.ask.timeout: 600
>>> >> >>
>>> >> >> Does anyone know what the cause is?
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> On 2020-06-19 14:57:05, "Zhou Zach" wrote:
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >After killing the flink job with yarn application kill,
>>> >> >> >I ran /opt/flink-1.10.0/bin/flink run -s
>>> >> >>
>>> >>
>>> /user/flink10/checkpoints/69e450574d8520ac5961e20a6fc4798a/chk-18/_metadata
>>> >> >> -d -c dataflow.sql.FromKafkaSinkJdbcForCountPerSecond
>>> >> >> /data/warehouse/streaming/data-flow-1.0.jar
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >2020-06-19 14:39:54,563 INFO
>>> >> >>
>>> >>
>>> org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
>>> >> >> - State change: CONNECTED
>>> >> >> >2020-06-19 14:39:54,664 INFO
>>> >> >>
>>> >>
>>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
>>> >> >> Starting ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>>> >> >> >2020-06-19 14:40:24,728 INFO
>>> >> >>
>>> >>
>>> org.apache.flink.runtime.

Re:Re:Re: Re: Re: flink run from checkpoit failed

2020-06-22 Posted by Zhou Zach
I restarted the CDH6 cluster and still get the same error. Flink failure recovery does not work, so I dare not go to production. Could someone please take a look?
















On 2020-06-22 13:21:01, "Zhou Zach" wrote:

After killing the YARN application with yarn application kill,
running /opt/flink-1.10.0/bin/flink run -s 
hdfs://nameservice1:8020/user/flink10/checkpoints/f1b6f5392cd5053db155e709ffe9f871/chk-15/_metadata
  dataflow.sql.FromKafkaSinkJdbcForCountPerSecond 
/data/warehouse/streaming/data-flow-1.0.jar does not start; the logs under /opt/flink-1.10.0/log were attached...


Running /opt/flink-1.10.0/bin/flink run -c 
dataflow.sql.FromKafkaSinkJdbcForCountPerSecond -m yarn-cluster -yjm 1024m -ytm 
8192m -p 2 -ys 4 -ynm UV -d data-flow-1.0.jar starts fine; it only fails when the -s argument is added...



















On 2020-06-21 09:16:45, "Congxian Qiu" wrote:
>Hi
>
>Did the application for this job come up? If it did, you can check the JM
>log; if not, check the submitting client side for a more detailed submission log. The log directory defaults to `/opt/flink-1.10.0/log`
>
>Best,
>Congxian
>
>
>Zhou Zach wrote on Fri, Jun 19, 2020 at 8:15 PM:
>
>> I'm using per-job mode, not yarn-session mode
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> At 2020-06-19 20:06:47, "Rui Li"  wrote:
>> >Then you need to restart the yarn session and submit the job again
>> >
>> >On Fri, Jun 19, 2020 at 6:22 PM Zhou Zach  wrote:
>> >
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> I killed the YARN application with yarn application kill; after that YARN did not restart the flink
>> job
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 2020-06-19 17:54:45, "Rui Li" wrote:
>> >> >By "yarn application kill flink job" do you mean the YARN application was killed? Did it get restarted afterwards?
>> >> >
>> >> >On Fri, Jun 19, 2020 at 4:09 PM Zhou Zach  wrote:
>> >> >
>> >> >>
>> >> >>
>> >> >> I added the two timeout settings below in flink-1.10.0/conf/flink-conf.yaml, but they have no effect
>> >> >> akka.client.timeout: 6
>> >> >> akka.ask.timeout: 600
>> >> >>
>> >> >> Does anyone know what the cause is?
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> On 2020-06-19 14:57:05, "Zhou Zach" wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >After killing the flink job with yarn application kill,
>> >> >> >I ran /opt/flink-1.10.0/bin/flink run -s
>> >> >>
>> >>
>> /user/flink10/checkpoints/69e450574d8520ac5961e20a6fc4798a/chk-18/_metadata
>> >> >> -d -c dataflow.sql.FromKafkaSinkJdbcForCountPerSecond
>> >> >> /data/warehouse/streaming/data-flow-1.0.jar
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >2020-06-19 14:39:54,563 INFO
>> >> >>
>> >>
>> org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
>> >> >> - State change: CONNECTED
>> >> >> >2020-06-19 14:39:54,664 INFO
>> >> >>
>> >>
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
>> >> >> Starting ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>> >> >> >2020-06-19 14:40:24,728 INFO
>> >> >>
>> >>
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
>> >> >> Stopping ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>> >> >> >2020-06-19 14:40:24,729 INFO
>> >> >>
>> >>
>> org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl
>> >> >> - backgroundOperationsLoop exiting
>> >> >> >2020-06-19 14:40:24,733 INFO
>> >> >> org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper  -
>> >> >> Session: 0x272b776faca2414 closed
>> >> >> >2020-06-19 14:40:24,733 INFO
>> >> >> org.apache.flink.shaded.zookeeper.org.apache.zookeeper.Clie

Re:Re: Re: flink run from checkpoit failed

2020-06-19 Posted by Zhou Zach
I'm using per-job mode, not yarn-session mode

















At 2020-06-19 20:06:47, "Rui Li"  wrote:
>Then you need to restart the yarn session and submit the job again
>
>On Fri, Jun 19, 2020 at 6:22 PM Zhou Zach  wrote:
>
>>
>>
>>
>>
>>
>>
>> I killed the YARN application with yarn application kill; after that YARN did not restart the flink job
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-06-19 17:54:45, "Rui Li" wrote:
>> >By "yarn application kill flink job" do you mean the YARN application was killed? Did it get restarted afterwards?
>> >
>> >On Fri, Jun 19, 2020 at 4:09 PM Zhou Zach  wrote:
>> >
>> >>
>> >>
>> >> I added the two timeout settings below in flink-1.10.0/conf/flink-conf.yaml, but they have no effect
>> >> akka.client.timeout: 6
>> >> akka.ask.timeout: 600
>> >>
>> >> Does anyone know what the cause is?
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 2020-06-19 14:57:05, "Zhou Zach" wrote:
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >After killing the flink job with yarn application kill,
>> >> >I ran /opt/flink-1.10.0/bin/flink run -s
>> >>
>> /user/flink10/checkpoints/69e450574d8520ac5961e20a6fc4798a/chk-18/_metadata
>> >> -d -c dataflow.sql.FromKafkaSinkJdbcForCountPerSecond
>> >> /data/warehouse/streaming/data-flow-1.0.jar
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >2020-06-19 14:39:54,563 INFO
>> >>
>> org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
>> >> - State change: CONNECTED
>> >> >2020-06-19 14:39:54,664 INFO
>> >>
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
>> >> Starting ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>> >> >2020-06-19 14:40:24,728 INFO
>> >>
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
>> >> Stopping ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>> >> >2020-06-19 14:40:24,729 INFO
>> >>
>> org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl
>> >> - backgroundOperationsLoop exiting
>> >> >2020-06-19 14:40:24,733 INFO
>> >> org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper  -
>> >> Session: 0x272b776faca2414 closed
>> >> >2020-06-19 14:40:24,733 INFO
>> >> org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  -
>> >> EventThread shut down for session: 0x272b776faca2414
>> >> >2020-06-19 14:40:24,734 ERROR org.apache.flink.client.cli.CliFrontend
>> >>- Error while running the command.
>> >> >org.apache.flink.client.program.ProgramInvocationException: The main
>> >> method caused an error: java.util.concurrent.ExecutionException:
>> >> org.apache.flink.runtime.client.JobSubmissionException: Failed to submit
>> >> JobGraph.
>> >> >at
>> >>
>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>> >> >at
>> >>
>> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>> >> >at
>> >> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>> >> >at
>> >>
>> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>> >> >at
>> >> org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>> >> >at
>> >>
>> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>> >> >at
>> >>
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>> >> >at java.security.AccessController.doPrivileged(Native Method)
>> >> >at javax.security.auth.Subject.doAs(Subject.java:422)
>> >> >at
>> >>
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>> >> >at
>> >>
>> org.apache.flink

Re:flink run from checkpoint failed

2020-06-19 Posted by Zhou Zach


I added the following two timeout parameters to flink-1.10.0/conf/flink-conf.yaml, but they have no effect
akka.client.timeout: 6
akka.ask.timeout: 600

Does anyone know what the cause might be?














On 2020-06-19 14:57:05, "Zhou Zach" wrote:
>
>
>
>
>After killing the flink job with yarn application kill,
>I ran /opt/flink-1.10.0/bin/flink run -s 
>/user/flink10/checkpoints/69e450574d8520ac5961e20a6fc4798a/chk-18/_metadata -d 
>-c dataflow.sql.FromKafkaSinkJdbcForCountPerSecond  
>/data/warehouse/streaming/data-flow-1.0.jar
>
>
>
>
>
>
>
>
>2020-06-19 14:39:54,563 INFO  
>org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
>  - State change: CONNECTED
>2020-06-19 14:39:54,664 INFO  
>org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
>Starting ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>2020-06-19 14:40:24,728 INFO  
>org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
>Stopping ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>2020-06-19 14:40:24,729 INFO  
>org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl
>  - backgroundOperationsLoop exiting
>2020-06-19 14:40:24,733 INFO  
>org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper  - Session: 
>0x272b776faca2414 closed
>2020-06-19 14:40:24,733 INFO  
>org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - 
>EventThread shut down for session: 0x272b776faca2414
>2020-06-19 14:40:24,734 ERROR org.apache.flink.client.cli.CliFrontend  
> - Error while running the command.
>org.apache.flink.client.program.ProgramInvocationException: The main method 
>caused an error: java.util.concurrent.ExecutionException: 
>org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
>JobGraph.
>at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>at 
> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>at java.security.AccessController.doPrivileged(Native Method)
>at javax.security.auth.Subject.doAs(Subject.java:422)
>at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
>Caused by: java.lang.RuntimeException: 
>java.util.concurrent.ExecutionException: 
>org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
>JobGraph.
>at 
> org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
>at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1741)
>at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:94)
>at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:63)
>at 
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1620)
>at 
> org.apache.flink.table.planner.delegation.StreamExecutor.execute(StreamExecutor.java:42)
>at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:643)
>at 
> cn.ibobei.qile.dataflow.sql.FromKafkaSinkJdbcForCountPerSecond$.main(FromKafkaSinkJdbcForCountPerSecond.scala:120)
>at 
> cn.ibobei.qile.dataflow.sql.FromKafkaSinkJdbcForCountPerSecond.main(FromKafkaSinkJdbcForCountPerSecond.scala)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.lang.reflect.Method.invoke(Method.java:498)
>at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
>... 11 more
>Caused by: java.util.concurrent.ExecutionException: 
>org.apache.flink.runtime.client.JobSubmissionException: Failed to sub

flink job: automatic checkpoints succeed, manually triggered checkpoint (savepoint) fails

2020-06-19 Posted by Zhou Zach




2020-06-19 15:11:18,361 INFO  org.apache.flink.client.cli.CliFrontend   
- Triggering savepoint for job e229c76e6a1b43142cb4272523102ed1.
2020-06-19 15:11:18,378 INFO  org.apache.flink.client.cli.CliFrontend   
- Waiting for response...
2020-06-19 15:11:48,381 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Stopping ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
2020-06-19 15:11:48,382 INFO  
org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl
  - backgroundOperationsLoop exiting
2020-06-19 15:11:48,385 INFO  
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper  - Session: 
0x172b776fac82479 closed
2020-06-19 15:11:48,385 INFO  
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - 
EventThread shut down for session: 0x172b776fac82479
2020-06-19 15:11:48,385 ERROR org.apache.flink.client.cli.CliFrontend   
- Error while running the command.
org.apache.flink.util.FlinkException: Triggering a savepoint for the job 
e229c76e6a1b43142cb4272523102ed1 failed.
at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:633)
at 
org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:611)
at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:843)
at 
org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:608)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:910)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
Caused by: java.util.concurrent.TimeoutException
at 
org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:999)
at 
org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)
at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$14(FutureUtils.java:427)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

flink run from checkpoint failed

2020-06-19 Posted by Zhou Zach




After killing the flink job with yarn application kill,
I ran /opt/flink-1.10.0/bin/flink run -s 
/user/flink10/checkpoints/69e450574d8520ac5961e20a6fc4798a/chk-18/_metadata -d 
-c dataflow.sql.FromKafkaSinkJdbcForCountPerSecond  
/data/warehouse/streaming/data-flow-1.0.jar








2020-06-19 14:39:54,563 INFO  
org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
  - State change: CONNECTED
2020-06-19 14:39:54,664 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Starting ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
2020-06-19 14:40:24,728 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Stopping ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
2020-06-19 14:40:24,729 INFO  
org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl
  - backgroundOperationsLoop exiting
2020-06-19 14:40:24,733 INFO  
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper  - Session: 
0x272b776faca2414 closed
2020-06-19 14:40:24,733 INFO  
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - 
EventThread shut down for session: 0x272b776faca2414
2020-06-19 14:40:24,734 ERROR org.apache.flink.client.cli.CliFrontend   
- Error while running the command.
org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
JobGraph.
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
JobGraph.
at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1741)
at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:94)
at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:63)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1620)
at 
org.apache.flink.table.planner.delegation.StreamExecutor.execute(StreamExecutor.java:42)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:643)
at 
cn.ibobei.qile.dataflow.sql.FromKafkaSinkJdbcForCountPerSecond$.main(FromKafkaSinkJdbcForCountPerSecond.scala:120)
at 
cn.ibobei.qile.dataflow.sql.FromKafkaSinkJdbcForCountPerSecond.main(FromKafkaSinkJdbcForCountPerSecond.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
... 11 more
Caused by: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
JobGraph.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1736)
... 23 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to 
submit JobGraph.
at 
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$7(RestClusterClient.java:359)
at 

Re:Re: Project depending on flink-1.11.0 fails to package

2020-06-18 Posted by Zhou Zach
import org.apache.flink.api.common.time.Time
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
import org.apache.flink.streaming.api.{CheckpointingMode, TimeCharacteristic}
import org.apache.flink.table.api.EnvironmentSettings
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment

















On 2020-06-18 19:41:08, "Jark Wu" wrote:
>Could you paste the complete code? (the imports part)
>
>Best,
>Jark
>
>On Thu, 18 Jun 2020 at 19:18, Zhou Zach  wrote:
>
>>
>>
>> With flink-1.10.0 I referenced org.apache.flink.table.api.java.StreamTableEnvironment; after switching to flink-1.11.0,
>> IntelliJ IDEA suggested changing it to org.apache.flink.table.api.bridge.java.StreamTableEnvironment. The IntelliJ
>> IDEA build succeeds, but packaging fails.
>>
>>
>>
>>
>> [ERROR]
>> /Users/Zach/flink-common_1.11.0/src/main/scala/org/rabbit/sql/FromKafkaSinkJdbcForUserUV.scala:7:
>> error: object StreamTableEnvironment is not a member of package
>> org.apache.flink.table.api.bridge.java
>> [ERROR] import
>> org.apache.flink.table.api.bridge.java.StreamTableEnvironment
>>
>>
>>
>>
>> Code:
>> val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
>>
>> streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
>> streamExecutionEnv.enableCheckpointing(20 * 1000,
>> CheckpointingMode.EXACTLY_ONCE)
>> streamExecutionEnv.getCheckpointConfig.setCheckpointTimeout(900 * 1000)
>>
>> val blinkEnvSettings =
>> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
>> val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv,
>> blinkEnvSettings)
>> pom.xml:
>> 
>>   UTF-8
>> 
>> 1.11-SNAPSHOT
>>   1.8
>>   2.11.12
>>   2.11
>>   ${java.version}
>>   ${java.version}
>>
>> 
>>  org.apache.flink
>>  flink-java
>>  ${flink.version}
>> 
>> 
>>   
>>  org.apache.flink
>>
>>  flink-streaming-java_${scala.binary.version}
>>  ${flink.version}
>> 
>> 
>>
>> 
>> 
>>  org.apache.flink
>>  flink-table
>>  ${flink.version}
>>  pom
>> 
>> 
>>
>>   
>>  org.apache.flink
>>  flink-scala_2.11
>>  ${flink.version}
>> 
>> 
>>   
>>  org.apache.flink
>>  flink-jdbc_2.11
>>  ${flink.version}
>>  provided
>>   
>>
>>   
>>  org.apache.flink
>>  flink-streaming-scala_2.11
>>  ${flink.version}
>> 
>> 
>>
>>   
>>  org.apache.flink
>>  flink-table-common
>>  ${flink.version}
>> 
>> 
>> 
>> 
>>  org.apache.flink
>>  flink-table-api-scala-bridge_2.11
>>  ${flink.version}
>> 
>> 
>>
>> 
>> 
>>  org.apache.flink
>>  flink-table-api-scala_2.11
>>  ${flink.version}
>> 
>> 
>>
>>
>>
>>
>> 
>>
>>   
>>   
>>
>>
>> 
>>  org.apache.flink
>>  flink-connector-kafka_2.11
>>  ${flink.version}
>>  provided
>>   
>>   
>>  org.apache.flink
>>  flink-avro
>>  ${flink.version}
>>  provided
>>   
>>   
>>  org.apache.flink
>>  flink-csv
>>  ${flink.version}
>>  provided
>>   
>> 
>> 
>>  org.apache.flink
>>  flink-json
>>  ${flink.version}
>>  provided
>>   
>>
>>
>> 
>>
>>
>> 
>>  org.apache.bahir
>>  flink-connector-redis_2.11
>>  1.0
>>  provided
>>   
>>
>> 
>> 
>>  org.apache.flink
>>  flink-connector-hive_2.11
>>  ${flink.version}
>>  provided
>>   
>>
>> 
>> 
>> 
>> 
>> 
>> 
>>
>> 
>>  org.apache.flink
>>  flink-table-api-java
>>  ${flink.version}
>>  provided
>>   
>>
>> 
>> 
>>  org.apache.flink
>>  flink-table-planner_2.11
>>  ${flink.version}
>> 
>> 
>>
>>   
>>  org.apache.flink
>>  flink-table-planner-blink_2.11
>>  ${flink.version}
>>  provided
>>   
>> 
>> 
>>  org.apache.flink
>>  flink-sql-connector-kafka_2.11
>>  ${flink.version}
>>  provided
>>   
>>
>>
>>   
>>  org.apache.flink
>>  flink-connector-hbase_2.11
>>  ${flink.version}
>>   
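
A likely explanation for the [ERROR] above: in Flink 1.11 the class org.apache.flink.table.api.bridge.java.StreamTableEnvironment ships in the flink-table-api-java-bridge module, which does not appear in the quoted pom, while flink-table-api-scala-bridge_2.11 does. Since the project code is Scala, here is a minimal, hedged sketch that compiles against only the scala bridge (module and class names as of Flink 1.11; treat them as assumptions to verify against the release you build with):

import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.EnvironmentSettings
// From flink-table-api-scala-bridge_2.11, already listed in the pom above.
import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment

object ScalaBridgeSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val settings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
    // The scala bridge provides its own StreamTableEnvironment for Scala programs,
    // so the java bridge module is not needed on the compile classpath.
    val tableEnv = StreamTableEnvironment.create(env, settings)
    println(tableEnv.getClass.getName)
  }
}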


Project depending on flink-1.11.0 fails to package

2020-06-18 Posted by Zhou Zach


With flink-1.10.0 I referenced org.apache.flink.table.api.java.StreamTableEnvironment; after switching to flink-1.11.0,
IntelliJ IDEA suggested changing it to org.apache.flink.table.api.bridge.java.StreamTableEnvironment. The IntelliJ
IDEA build succeeds, but packaging fails.




[ERROR] 
/Users/Zach/flink-common_1.11.0/src/main/scala/org/rabbit/sql/FromKafkaSinkJdbcForUserUV.scala:7:
 error: object StreamTableEnvironment is not a member of package 
org.apache.flink.table.api.bridge.java
[ERROR] import org.apache.flink.table.api.bridge.java.StreamTableEnvironment




Code:
val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
streamExecutionEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
streamExecutionEnv.enableCheckpointing(20 * 1000, 
CheckpointingMode.EXACTLY_ONCE)
streamExecutionEnv.getCheckpointConfig.setCheckpointTimeout(900 * 1000)

val blinkEnvSettings = 
EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
blinkEnvSettings)
pom.xml:

  UTF-8

1.11-SNAPSHOT
  1.8
  2.11.12
  2.11
  ${java.version}
  ${java.version}
   

 org.apache.flink
 flink-java
 ${flink.version}


  
 org.apache.flink
 flink-streaming-java_${scala.binary.version}
 ${flink.version}





 org.apache.flink
 flink-table
 ${flink.version}
 pom



  
 org.apache.flink
 flink-scala_2.11
 ${flink.version}


  
 org.apache.flink
 flink-jdbc_2.11
 ${flink.version}
 provided
  

  
 org.apache.flink
 flink-streaming-scala_2.11
 ${flink.version}



  
 org.apache.flink
 flink-table-common
 ${flink.version}




 org.apache.flink
 flink-table-api-scala-bridge_2.11
 ${flink.version}





 org.apache.flink
 flink-table-api-scala_2.11
 ${flink.version}








  
  



 org.apache.flink
 flink-connector-kafka_2.11
 ${flink.version}
 provided
  
  
 org.apache.flink
 flink-avro
 ${flink.version}
 provided
  
  
 org.apache.flink
 flink-csv
 ${flink.version}
 provided
  


 org.apache.flink
 flink-json
 ${flink.version}
 provided
  






 org.apache.bahir
 flink-connector-redis_2.11
 1.0
 provided
  



 org.apache.flink
 flink-connector-hive_2.11
 ${flink.version}
 provided
  









 org.apache.flink
 flink-table-api-java
 ${flink.version}
 provided
  



 org.apache.flink
 flink-table-planner_2.11
 ${flink.version}



  
 org.apache.flink
 flink-table-planner-blink_2.11
 ${flink.version}
 provided
  


 org.apache.flink
 flink-sql-connector-kafka_2.11
 ${flink.version}
 provided
  


  
 org.apache.flink
 flink-connector-hbase_2.11
 ${flink.version}
  

Re:flink sql sink mysql requires primary keys

2020-06-17 Posted by Zhou Zach
Adding a primary key raises the following error:
Exception in thread "main" 
org.apache.flink.table.planner.operations.SqlConversionException: Primary key 
and unique key are not supported yet.
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertCreateTable(SqlToOperationConverter.java:169)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:130)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
at 
org.rabbit.sql.FromKafkaSinkJdbcForUserUV$.main(FromKafkaSinkJdbcForUserUV.scala:52)
at 
org.rabbit.sql.FromKafkaSinkJdbcForUserUV.main(FromKafkaSinkJdbcForUserUV.scala)


Query:


streamTableEnv.sqlUpdate(
"""
|
|CREATE TABLE user_uv(
|`time` VARCHAR,
|cnt bigint,
|PRIMARY KEY (`time`)
|) WITH (
|'connector.type' = 'jdbc',
|'connector.write.flush.max-rows' = '1'
|)
|""".stripMargin)

















At 2020-06-17 20:59:35, "Zhou Zach"  wrote:
>Exception in thread "main" org.apache.flink.table.api.TableException: 
>UpsertStreamTableSink requires that Table has a full primary keys if it is 
>updated.
>   at 
> org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:113)
>   at 
> org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:48)
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:58)
>   at 
> org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlan(StreamExecSink.scala:48)
>   at 
> org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:60)
>   at 
> org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:59)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:59)
>   at 
> org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:153)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:682)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:495)
>   at 
> org.rabbit.sql.FromKafkaSinkJdbcForUserUV$.main(FromKafkaSinkJdbcForUserUV.scala:68)
>   at 
> org.rabbit.sql.FromKafkaSinkJdbcForUserUV.main(FromKafkaSinkJdbcForUserUV.scala)
>
>
>
>
>
>Query:
>Flink :1.10.0
>CREATE TABLE user_uv(
>|`time` VARCHAR,
>|cnt bigint
>|) WITH (
>|'connector.type' = 'jdbc')
>|insert into user_uv
>|select  MAX(DATE_FORMAT(created_time, '-MM-dd HH:mm:00')) as `time`, 
>COUNT(DISTINCT  uid) as cnt
>|from `user`
>|group by DATE_FORMAT(created_time, '-MM-dd HH:mm:00')


Re:Re: flink sql DDL Unsupported update-mode hbase

2020-06-16 Posted by Zhou Zach
So with the flink sql DDL approach, reading, writing, updating, and deleting hbase are all supported then?

















At 2020-06-17 13:45:15, "Jark Wu"  wrote:
>Hi,
>
>The HBase connector does not need the update-mode property, and it must not be declared.
>
>Best,
>Jark
>
>On Wed, 17 Jun 2020 at 13:08, Zhou Zach  wrote:
>
>> The program finished with the following exception:
>>
>>
>> org.apache.flink.client.program.ProgramInvocationException: The main
>> method caused an error: Could not find a suitable table factory for
>> 'org.apache.flink.table.factories.TableSinkFactory' in
>> the classpath.
>>
>>
>> Reason: No factory supports all properties.
>>
>>
>> The matching candidates:
>> org.apache.flink.addons.hbase.HBaseTableFactory
>> Unsupported property keys:
>> update-mode
>>
>>
>> The following properties are requested:
>> connector.table-name=user_hbase10
>> connector.type=hbase
>> connector.version=2.1.0
>> connector.write.buffer-flush.interval=2s
>> connector.write.buffer-flush.max-rows=1000
>> connector.write.buffer-flush.max-size=10mb
>> connector.zookeeper.quorum=cdh1:2181,cdh2:2181,cdh3:2181
>> connector.zookeeper.znode.parent=/hbase
>> schema.0.data-type=VARCHAR(2147483647)
>> schema.0.name=rowkey
>> schema.1.data-type=ROW<`sex` VARCHAR(2147483647), `age` INT,
>> `created_time` TIMESTAMP(3)
>> schema.1.name=cf
>> update-mode=upsert
>>
>>
>> The following factories have been considered:
>> org.apache.flink.addons.hbase.HBaseTableFactory
>> org.apache.flink.api.java.io.jdbc.JDBCTableSourceSinkFactory
>> org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
>> org.apache.flink.table.sinks.CsvBatchTableSinkFactory
>> org.apache.flink.table.sinks.CsvAppendTableSinkFactory
>> at
>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>> at
>> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>> at
>> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>> at
>> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>> at
>> org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>> at
>> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>> at
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>> at
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>> at
>> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
>> Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException:
>> Could not find a suitable table factory for
>> 'org.apache.flink.table.factories.TableSinkFactory' in
>> the classpath.
>>
>>
>> Reason: No factory supports all properties.
>>
>>
>> The matching candidates:
>> org.apache.flink.addons.hbase.HBaseTableFactory
>> Unsupported property keys:
>> update-mode
>>
>>
>> The following properties are requested:
>> connector.table-name=user_hbase10
>> connector.type=hbase
>> connector.version=2.1.0
>> connector.write.buffer-flush.interval=2s
>> connector.write.buffer-flush.max-rows=1000
>> connector.write.buffer-flush.max-size=10mb
>> connector.zookeeper.quorum=cdh1:2181,cdh2:2181,cdh3:2181
>> connector.zookeeper.znode.parent=/hbase
>> schema.0.data-type=VARCHAR(2147483647)
>> schema.0.name=rowkey
>> schema.1.data-type=ROW<`sex` VARCHAR(2147483647), `age` INT,
>> `created_time` TIMESTAMP(3)
>> schema.1.name=cf
>> update-mode=upsert
>>
>>
>> The following factories have been considered:
>> org.apache.flink.addons.hbase.HBaseTableFactory
>> org.apache.flink.api.java.io.jdbc.JDBCTableSourceSinkFactory
>> org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
>> org.apache.flink.table.sinks.CsvBatchTableSinkFactory
>> org.apache.flink.table.sinks.CsvAppendTableSinkFactory
>> at
>> org.apache.flink.table.factories.TableFactoryService.filterBySupportedProperties(TableFactoryService.java:434)
>> at
>> org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryServ
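
Following the reply quoted above, a hedged sketch of the same sink DDL with the update-mode line dropped, keeping the property values listed in the error (whether connector.version = '2.1.0' is accepted is a separate question; the HBase connector shipped with Flink 1.10 documents '1.4.3'). It reuses the streamTableEnv from the surrounding code:

streamTableEnv.sqlUpdate(
  """
    |CREATE TABLE user_hbase10 (
    |  rowkey VARCHAR,
    |  cf ROW<sex VARCHAR, age INT, created_time TIMESTAMP(3)>
    |) WITH (
    |  'connector.type' = 'hbase',
    |  'connector.version' = '2.1.0',
    |  'connector.table-name' = 'user_hbase10',
    |  'connector.zookeeper.quorum' = 'cdh1:2181,cdh2:2181,cdh3:2181',
    |  'connector.zookeeper.znode.parent' = '/hbase',
    |  'connector.write.buffer-flush.max-size' = '10mb',
    |  'connector.write.buffer-flush.max-rows' = '1000',
    |  'connector.write.buffer-flush.interval' = '2s'
    |)
    |""".stripMargin)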

flink sql DDL Unsupported update-mode hbase

2020-06-16 Posted by Zhou Zach
The program finished with the following exception:


org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: Could not find a suitable table factory for 
'org.apache.flink.table.factories.TableSinkFactory' in
the classpath.


Reason: No factory supports all properties.


The matching candidates:
org.apache.flink.addons.hbase.HBaseTableFactory
Unsupported property keys:
update-mode


The following properties are requested:
connector.table-name=user_hbase10
connector.type=hbase
connector.version=2.1.0
connector.write.buffer-flush.interval=2s
connector.write.buffer-flush.max-rows=1000
connector.write.buffer-flush.max-size=10mb
connector.zookeeper.quorum=cdh1:2181,cdh2:2181,cdh3:2181
connector.zookeeper.znode.parent=/hbase
schema.0.data-type=VARCHAR(2147483647)
schema.0.name=rowkey
schema.1.data-type=ROW<`sex` VARCHAR(2147483647), `age` INT, `created_time` 
TIMESTAMP(3)
schema.1.name=cf
update-mode=upsert


The following factories have been considered:
org.apache.flink.addons.hbase.HBaseTableFactory
org.apache.flink.api.java.io.jdbc.JDBCTableSourceSinkFactory
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could 
not find a suitable table factory for 
'org.apache.flink.table.factories.TableSinkFactory' in
the classpath.


Reason: No factory supports all properties.


The matching candidates:
org.apache.flink.addons.hbase.HBaseTableFactory
Unsupported property keys:
update-mode


The following properties are requested:
connector.table-name=user_hbase10
connector.type=hbase
connector.version=2.1.0
connector.write.buffer-flush.interval=2s
connector.write.buffer-flush.max-rows=1000
connector.write.buffer-flush.max-size=10mb
connector.zookeeper.quorum=cdh1:2181,cdh2:2181,cdh3:2181
connector.zookeeper.znode.parent=/hbase
schema.0.data-type=VARCHAR(2147483647)
schema.0.name=rowkey
schema.1.data-type=ROW<`sex` VARCHAR(2147483647), `age` INT, `created_time` 
TIMESTAMP(3)
schema.1.name=cf
update-mode=upsert


The following factories have been considered:
org.apache.flink.addons.hbase.HBaseTableFactory
org.apache.flink.api.java.io.jdbc.JDBCTableSourceSinkFactory
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
at 
org.apache.flink.table.factories.TableFactoryService.filterBySupportedProperties(TableFactoryService.java:434)
at 
org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:195)
at 
org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:143)
at 
org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:96)
at 
org.apache.flink.table.planner.delegation.PlannerBase.getTableSink(PlannerBase.scala:310)
at 
org.apache.flink.table.planner.delegation.PlannerBase.translateToRel(PlannerBase.scala:190)
at 
org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:150)
at 
org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:150)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
  

Re: flink sql: how to convert a ROW type to INT

2020-06-16 Posted by Zhou Zach
[message body garbled by encoding; only its references to hbase survive]




------------------ Original message ------------------
From: "Leonard Xu"

Re: flink sql: how to convert a ROW type to INT

2020-06-16 Posted by Zhou Zach
The error is:
offset (0) + length (4) exceed the capacity of the array: 2
Is the value stored in hbase not an int?
The column is declared with users.addColumn("cf", "age", classOf[Integer]);
does it matter whether int or Integer is used here (I tried both Integer and int)?




------------------ Original message ------------------
From: "Leonard Xu"

Re: Re: flink sql read hbase sink mysql data type not match

2020-06-16 Posted by Zhou Zach
2020-06-16 21:01:09,756 INFO 
org.apache.flink.kafka.shaded.org.apache.kafka.common.utils.AppInfoParser 
- Kafka version: unknown
2020-06-16 21:01:09,757 INFO 
org.apache.flink.kafka.shaded.org.apache.kafka.common.utils.AppInfoParser 
- Kafka commitId: unknown
2020-06-16 21:01:09,758 INFO 
org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.KafkaConsumer
 - [Consumer clientId=consumer-7, groupId=null] Subscribed to partition(s): 
user_behavior-0
2020-06-16 21:01:09,765 INFO 
org.apache.flink.kafka.shaded.org.apache.kafka.clients.Metadata - Cluster 
ID: cAT_xBISQNWghT9kR5UuIw
2020-06-16 21:01:09,766 WARN 
org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.ConsumerConfig
 - The configuration 'zookeeper.connect' was supplied but isn't a known config.
2020-06-16 21:01:09,766 INFO 
org.apache.flink.kafka.shaded.org.apache.kafka.common.utils.AppInfoParser 
- Kafka version: unknown
2020-06-16 21:01:09,767 INFO 
org.apache.flink.kafka.shaded.org.apache.kafka.common.utils.AppInfoParser 
- Kafka commitId: unknown
2020-06-16 21:01:09,768 INFO 
org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.internals.Fetcher
 - [Consumer clientId=consumer-7, groupId=null] Resetting offset for partition 
user_behavior-0 to offset 43545.
2020-06-16 21:01:35,904 INFO 
org.apache.flink.addons.hbase.HBaseLookupFunction
  - start close ...
2020-06-16 21:01:35,906 INFO 
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient
  - Close zookeeper connection 0x72d39885 to 
cdh1:2181,cdh2:2181,cdh3:2181
2020-06-16 21:01:35,908 INFO 
org.apache.flink.addons.hbase.HBaseLookupFunction
  - end close.
2020-06-16 21:01:35,908 INFO org.apache.zookeeper.ZooKeeper  
   
   - Session: 0x172b776fac80ae4 closed
2020-06-16 21:01:35,909 INFO org.apache.zookeeper.ClientCnxn  
   
  - EventThread shut down
2020-06-16 21:01:35,911 INFO 
org.apache.flink.runtime.taskmanager.Task 
 - Source: KafkaTableSource(uid, 
phoneType, clickCount, time) - 
SourceConversion(table=[default_catalog.default_database.user_behavior, source: 
[KafkaTableSource(uid, phoneType, clickCount, time)]], fields=[uid, phoneType, 
clickCount, time]) - Calc(select=[uid, time]) - 
LookupJoin(table=[HBaseTableSource[schema=[rowkey, cf], projectFields=null]], 
joinType=[InnerJoin], async=[false], lookup=[rowkey=uid], select=[uid, time, 
rowkey, cf]) - Calc(select=[CAST(time) AS time, cf.age AS age]) - 
SinkConversionToTuple2 - Sink: JDBCUpsertTableSink(time, age) (1/2) 
(e45989f173dc35aefc52413349db7f30) switched from RUNNING to FAILED.
java.lang.IllegalArgumentException: offset (0) + length (4) exceed the capacity 
of the array: 2
at 
org.apache.hadoop.hbase.util.Bytes.explainWrongLengthOrOffset(Bytes.java:838)
at org.apache.hadoop.hbase.util.Bytes.toInt(Bytes.java:1004)
at org.apache.hadoop.hbase.util.Bytes.toInt(Bytes.java:980)
at 
org.apache.flink.addons.hbase.util.HBaseTypeUtils.deserializeToObject(HBaseTypeUtils.java:55)
at 
org.apache.flink.addons.hbase.util.HBaseReadWriteHelper.parseToRow(HBaseReadWriteHelper.java:158)
at 
org.apache.flink.addons.hbase.HBaseLookupFunction.eval(HBaseLookupFunction.java:78)
at LookupFunction$12.flatMap(Unknown Source)
at 
org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82)
at 
org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36)
at 
org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:641)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:616)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:596)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:730)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:708)
at StreamExecCalc$7.processElement(Unknown Source)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:641)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:616)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:596)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:730)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:708)
at SourceConversion$6.processElement(Unknown Source)
  

flink sql: how to convert a ROW type to INT

2020-06-16 Posted by Zhou Zach
Reading a ROW type from HBase with flink sql, how can the ROW be converted to INT?
select cast(cf as Int) cf from hbase_table
reports an error.

Re:Re: flink sql read hbase sink mysql data type not match

2020-06-16 Posted by Zhou Zach
How can ROW<`age` INT> be converted to INT in flink sql?


streamTableEnv.sqlUpdate(
  """
|
|insert into  user_age
|SELECT rowkey, cast(cf as int) as age
|FROM
|  users
|
|""".stripMargin)??

flink sql read hbase sink mysql data type not match

2020-06-16 Posted by Zhou Zach


org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: Field types of query result and registered TableSink 
default_catalog.default_database.user_age do not match.
Query schema: [rowkey: STRING, cf: ROW<`age` INT>]
Sink schema: [rowkey: STRING, age: INT]
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
Caused by: org.apache.flink.table.api.ValidationException: Field types of query 
result and registered TableSink default_catalog.default_database.user_age do 
not match.








query:




val users = new HBaseTableSource(hConf, "user_hbase5")
users.setRowKey("rowkey", classOf[String]) // currency as the primary key
users.addColumn("cf", "age", classOf[Integer])

streamTableEnv.registerTableSource("users", users)


streamTableEnv.sqlUpdate(
"""
|
|CREATE TABLE user_age (
|`rowkey` VARCHAR,
|age INT
|) WITH (
|'connector.type' = 'jdbc',
|'connector.write.flush.max-rows' = '1'
|)
|""".stripMargin)

streamTableEnv.sqlUpdate(
"""
|
|insert into  user_age
|SELECT *
|FROM
|  users
|
|""".stripMargin)

Re:Re: Re: Re:Re: flink sql job submitted to yarn fails with an error

2020-06-16 Posted by Zhou Zach
Yes, it does produce output.

















On 2020-06-16 15:24:29, "王松" wrote:
>Then if you run hadoop classpath on the command line, does it print the hadoop classpath?
>
>Zhou Zach wrote on Tue, Jun 16, 2020 at 3:22 PM:
>
>>
>>
>>
>>
>>
>>
>> In /etc/profile, so far I have only added
>> export HADOOP_CLASSPATH=`hadoop classpath`
>> I installed CDH and could not find that sbin directory.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-06-16 15:05:12, "王松" wrote:
>> >Have you configured the HADOOP_HOME and HADOOP_CLASSPATH environment variables?
>> >
>> >export HADOOP_HOME=/usr/local/hadoop-2.7.2
>> >export HADOOP_CLASSPATH=`hadoop classpath`
>> >export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
>> >
>> >Zhou Zach wrote on Tue, Jun 16, 2020 at 2:53 PM:
>> >
>> >> Jars under flink/lib/:
>> >> flink-connector-hive_2.11-1.10.0.jar
>> >> flink-dist_2.11-1.10.0.jar
>> >> flink-jdbc_2.11-1.10.0.jar
>> >> flink-json-1.10.0.jar
>> >> flink-shaded-hadoop-2-3.0.0-cdh6.3.0-7.0.jar
>> >> flink-sql-connector-kafka_2.11-1.10.0.jar
>> >> flink-table_2.11-1.10.0.jar
>> >> flink-table-blink_2.11-1.10.0.jar
>> >> hbase-client-2.1.0.jar
>> >> hbase-common-2.1.0.jar
>> >> hive-exec-2.1.1.jar
>> >> mysql-connector-java-5.1.49.jar
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 2020-06-16 14:48:43, "Zhou Zach" wrote:
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >high-availability.storageDir: hdfs:///flink/ha/
>> >> >high-availability.zookeeper.quorum: cdh1:2181,cdh2:2181,cdh3:2181
>> >> >state.backend: filesystem
>> >> >state.checkpoints.dir:
>> hdfs://nameservice1:8020//user/flink10/checkpoints
>> >> >state.savepoints.dir: hdfs://nameservice1:8020//user/flink10/savepoints
>> >> >high-availability.zookeeper.path.root: /flink
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >On 2020-06-16 14:44:02, "王松" wrote:
>> >> >>Could you paste the HA settings from your configuration file?
>> >> >>
>> >> >>Zhou Zach wrote on Tue, Jun 16, 2020 at 1:49 PM:
>> >> >>
>> >> >>> org.apache.flink.runtime.entrypoint.ClusterEntrypointException:
>> Failed
>> >> to
>> >> >>> initialize the cluster entrypoint YarnJobClusterEntrypoint.
>> >> >>>
>> >> >>> at
>> >> >>>
>> >>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
>> >> >>>
>> >> >>> at
>> >> >>>
>> >>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
>> >> >>>
>> >> >>> at
>> >> >>>
>> >>
>> org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
>> >> >>>
>> >> >>> Caused by: java.io.IOException: Could not create FileSystem for
>> highly
>> >> >>> available storage path
>> (hdfs:/flink/ha/application_1592215995564_0027)
>> >> >>>
>> >> >>> at
>> >> >>>
>> >>
>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)
>> >> >>>
>> >> >>> at
>> >> >>>
>> >>
>> org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)
>> >> >>>
>> >> >>> at
>> >> >>>
>> >>
>> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)
>> >> >>>
>> >> >>> at
>> >> >>>
>> >>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEn

Re:Re: Re:Re: flink sql job submitted to yarn fails with an error

2020-06-16 Posted by Zhou Zach






In /etc/profile, so far I have only added
export HADOOP_CLASSPATH=`hadoop classpath`
I installed CDH and could not find that sbin directory.











On 2020-06-16 15:05:12, "王松" wrote:
>Have you configured the HADOOP_HOME and HADOOP_CLASSPATH environment variables?
>
>export HADOOP_HOME=/usr/local/hadoop-2.7.2
>export HADOOP_CLASSPATH=`hadoop classpath`
>export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
>
>Zhou Zach wrote on Tue, Jun 16, 2020 at 2:53 PM:
>
>> Jars under flink/lib/:
>> flink-connector-hive_2.11-1.10.0.jar
>> flink-dist_2.11-1.10.0.jar
>> flink-jdbc_2.11-1.10.0.jar
>> flink-json-1.10.0.jar
>> flink-shaded-hadoop-2-3.0.0-cdh6.3.0-7.0.jar
>> flink-sql-connector-kafka_2.11-1.10.0.jar
>> flink-table_2.11-1.10.0.jar
>> flink-table-blink_2.11-1.10.0.jar
>> hbase-client-2.1.0.jar
>> hbase-common-2.1.0.jar
>> hive-exec-2.1.1.jar
>> mysql-connector-java-5.1.49.jar
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-06-16 14:48:43, "Zhou Zach" wrote:
>> >
>> >
>> >
>> >
>> >high-availability.storageDir: hdfs:///flink/ha/
>> >high-availability.zookeeper.quorum: cdh1:2181,cdh2:2181,cdh3:2181
>> >state.backend: filesystem
>> >state.checkpoints.dir: hdfs://nameservice1:8020//user/flink10/checkpoints
>> >state.savepoints.dir: hdfs://nameservice1:8020//user/flink10/savepoints
>> >high-availability.zookeeper.path.root: /flink
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On 2020-06-16 14:44:02, "王松" wrote:
>> >>Could you paste the HA settings from your configuration file?
>> >>
>> >>Zhou Zach wrote on Tue, Jun 16, 2020 at 1:49 PM:
>> >>
>> >>> org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed
>> to
>> >>> initialize the cluster entrypoint YarnJobClusterEntrypoint.
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
>> >>>
>> >>> Caused by: java.io.IOException: Could not create FileSystem for highly
>> >>> available storage path (hdfs:/flink/ha/application_1592215995564_0027)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
>> >>>
>> >>> at java.security.AccessController.doPrivileged(Native Method)
>> >>>
>> >>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> >>>
>> >>> at
>> >>>
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>> >>>
>> >>> at
>> >>>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
>> >>>
>> >>> ... 2 more
>> >>>
>> >>> Caused by:
>>

Re:Re:Re: flink sql job submitted to yarn fails with an error

2020-06-16 Posted by Zhou Zach
Jars under flink/lib/:
flink-connector-hive_2.11-1.10.0.jar
flink-dist_2.11-1.10.0.jar
flink-jdbc_2.11-1.10.0.jar
flink-json-1.10.0.jar
flink-shaded-hadoop-2-3.0.0-cdh6.3.0-7.0.jar
flink-sql-connector-kafka_2.11-1.10.0.jar
flink-table_2.11-1.10.0.jar
flink-table-blink_2.11-1.10.0.jar
hbase-client-2.1.0.jar
hbase-common-2.1.0.jar
hive-exec-2.1.1.jar
mysql-connector-java-5.1.49.jar

















On 2020-06-16 14:48:43, "Zhou Zach" wrote:
>
>
>
>
>high-availability.storageDir: hdfs:///flink/ha/
>high-availability.zookeeper.quorum: cdh1:2181,cdh2:2181,cdh3:2181
>state.backend: filesystem
>state.checkpoints.dir: hdfs://nameservice1:8020//user/flink10/checkpoints
>state.savepoints.dir: hdfs://nameservice1:8020//user/flink10/savepoints
>high-availability.zookeeper.path.root: /flink
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 2020-06-16 14:44:02, "王松" wrote:
>>Could you paste the HA settings from your configuration file?
>>
>>Zhou Zach wrote on Tue, Jun 16, 2020 at 1:49 PM:
>>
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to
>>> initialize the cluster entrypoint YarnJobClusterEntrypoint.
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
>>>
>>> at
>>> org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
>>>
>>> Caused by: java.io.IOException: Could not create FileSystem for highly
>>> available storage path (hdfs:/flink/ha/application_1592215995564_0027)
>>>
>>> at
>>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)
>>>
>>> at
>>> org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)
>>>
>>> at
>>> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
>>>
>>> at java.security.AccessController.doPrivileged(Native Method)
>>>
>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>
>>> at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>>>
>>> at
>>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
>>>
>>> ... 2 more
>>>
>>> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>>> Could not find a file system implementation for scheme 'hdfs'. The scheme
>>> is not directly supported by Flink and no Hadoop file system to support
>>> this scheme could be loaded.
>>>
>>> at
>>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)
>>>
>>> at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)
>>>
>>> at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
>>>
>>> at
>>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)
>>>
>>> ... 13 more
>>>
>>> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>>> Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in
>>> the classpath, or some classes are missing from the classpath.
>>>
>>> at
>>> org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:184)
>>>
>>> at
>>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:446)
>>>
>>> ... 16 more
>>>
>>> Caused by: java.lang.VerifyError: Bad return type
>>>
>>> Exception Details:
>>>
>>>   Location:
>>>
>>>
>>> org/apache/hadoop/hdfs/DFSClient.getQuotaUsage(Ljava/lang/String;)Lorg/apache/hadoop/fs/QuotaUsage;
>>> @160: areturn
>>>
>>>   Reason:
>>>
>>> Type 'org/apache/hadoop/fs/ContentSummary' (current frame, stack[0])
>>> is not assignable to 'org/apache/hadoop/fs/QuotaUsage' (from method
>>> signature)
>>>
>>>   Current Frame:
>>>
>>> bci: @160
>>>
>>> flags: { }
>>>
>>> locals: { 'org/apache/hadoop/hdfs/DFSClient', 'java/lang/String',
>>> 'org/apache/hadoop/ipc/RemoteException', 'java/io/IOException' }
>>> stack: { 'org/apache/hadoop/fs/ContentSummary' }
>>>
>>>
>>>
>>> It runs fine locally in IntelliJ IDEA. The flink job subscribes to kafka and sinks to mysql and hbase. Under the cluster's flink lib directory,
>>>
>>>
>>>
>>>


Re:Re:Re: flink sql job submitted to yarn fails with an error

2020-06-16 Posted by Zhou Zach
high-availability: zookeeper

















On 2020-06-16 14:48:43, "Zhou Zach" wrote:
>
>
>
>
>high-availability.storageDir: hdfs:///flink/ha/
>high-availability.zookeeper.quorum: cdh1:2181,cdh2:2181,cdh3:2181
>state.backend: filesystem
>state.checkpoints.dir: hdfs://nameservice1:8020//user/flink10/checkpoints
>state.savepoints.dir: hdfs://nameservice1:8020//user/flink10/savepoints
>high-availability.zookeeper.path.root: /flink
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 2020-06-16 14:44:02, "王松" wrote:
>>Could you paste the HA settings from your configuration file?
>>
>>Zhou Zach wrote on Tue, Jun 16, 2020 at 1:49 PM:
>>
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to
>>> initialize the cluster entrypoint YarnJobClusterEntrypoint.
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
>>>
>>> at
>>> org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
>>>
>>> Caused by: java.io.IOException: Could not create FileSystem for highly
>>> available storage path (hdfs:/flink/ha/application_1592215995564_0027)
>>>
>>> at
>>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)
>>>
>>> at
>>> org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)
>>>
>>> at
>>> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
>>>
>>> at java.security.AccessController.doPrivileged(Native Method)
>>>
>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>
>>> at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>>>
>>> at
>>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>
>>> at
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
>>>
>>> ... 2 more
>>>
>>> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>>> Could not find a file system implementation for scheme 'hdfs'. The scheme
>>> is not directly supported by Flink and no Hadoop file system to support
>>> this scheme could be loaded.
>>>
>>> at
>>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)
>>>
>>> at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)
>>>
>>> at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
>>>
>>> at
>>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)
>>>
>>> ... 13 more
>>>
>>> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>>> Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in
>>> the classpath, or some classes are missing from the classpath.
>>>
>>> at
>>> org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:184)
>>>
>>> at
>>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:446)
>>>
>>> ... 16 more
>>>
>>> Caused by: java.lang.VerifyError: Bad return type
>>>
>>> Exception Details:
>>>
>>>   Location:
>>>
>>>
>>> org/apache/hadoop/hdfs/DFSClient.getQuotaUsage(Ljava/lang/String;)Lorg/apache/hadoop/fs/QuotaUsage;
>>> @160: areturn
>>>
>>>   Reason:
>>>
>>> Type 'org/apache/hadoop/fs/ContentSummary' (current frame, stack[0])
>>> is not assignable to 'org/apache/hadoop/fs/QuotaUsage' (from method
>>> signature)
>>>
>>>   Current Frame:
>>>
>>> bci: @160
>>>
>>> flags: { }
>>>
>>> locals: { 'org/apache/hadoop/hdfs/DFSClient', 'java/lang/String',
>>> 'org/apache/hadoop/ipc/RemoteException', 'java/io/IOException' }
>>> stack: { 'org/apache/hadoop/fs/ContentSummary' }
>>>
>>>
>>>
>>> It runs fine locally in IntelliJ IDEA. The flink job subscribes to kafka and sinks to mysql and hbase. Under the cluster's flink lib directory,
>>>
>>>
>>>
>>>


Re:Re: flink sql job submitted to yarn fails with an error

2020-06-16 Posted by Zhou Zach




high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.quorum: cdh1:2181,cdh2:2181,cdh3:2181
state.backend: filesystem
state.checkpoints.dir: hdfs://nameservice1:8020//user/flink10/checkpoints
state.savepoints.dir: hdfs://nameservice1:8020//user/flink10/savepoints
high-availability.zookeeper.path.root: /flink

















On 2020-06-16 14:44:02, "王松" wrote:
>Could you paste the HA settings from your configuration file?
>
>Zhou Zach wrote on Tue, Jun 16, 2020 at 1:49 PM:
>
>> org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to
>> initialize the cluster entrypoint YarnJobClusterEntrypoint.
>>
>> at
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
>>
>> at
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
>>
>> at
>> org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
>>
>> Caused by: java.io.IOException: Could not create FileSystem for highly
>> available storage path (hdfs:/flink/ha/application_1592215995564_0027)
>>
>> at
>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)
>>
>> at
>> org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)
>>
>> at
>> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)
>>
>> at
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)
>>
>> at
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)
>>
>> at
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)
>>
>> at
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
>>
>> at java.security.AccessController.doPrivileged(Native Method)
>>
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>>
>> at
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>
>> at
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
>>
>> ... 2 more
>>
>> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>> Could not find a file system implementation for scheme 'hdfs'. The scheme
>> is not directly supported by Flink and no Hadoop file system to support
>> this scheme could be loaded.
>>
>> at
>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)
>>
>> at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)
>>
>> at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
>>
>> at
>> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)
>>
>> ... 13 more
>>
>> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>> Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in
>> the classpath, or some classes are missing from the classpath.
>>
>> at
>> org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:184)
>>
>> at
>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:446)
>>
>> ... 16 more
>>
>> Caused by: java.lang.VerifyError: Bad return type
>>
>> Exception Details:
>>
>>   Location:
>>
>>
>> org/apache/hadoop/hdfs/DFSClient.getQuotaUsage(Ljava/lang/String;)Lorg/apache/hadoop/fs/QuotaUsage;
>> @160: areturn
>>
>>   Reason:
>>
>> Type 'org/apache/hadoop/fs/ContentSummary' (current frame, stack[0])
>> is not assignable to 'org/apache/hadoop/fs/QuotaUsage' (from method
>> signature)
>>
>>   Current Frame:
>>
>> bci: @160
>>
>> flags: { }
>>
>> locals: { 'org/apache/hadoop/hdfs/DFSClient', 'java/lang/String',
>> 'org/apache/hadoop/ipc/RemoteException', 'java/io/IOException' }
>> stack: { 'org/apache/hadoop/fs/ContentSummary' }
>>
>>
>>
> It runs fine locally in IntelliJ IDEA; the flink job reads from kafka and sinks to mysql and hbase. In the cluster's flink lib directory,
>>
>>
>>
>>


Re:flink sql job fails when submitted to YARN

2020-06-16 Posted by Zhou Zach
Putting flink-shaded-hadoop-2-3.0.0-cdh6.3.0-7.0.jar into the flink/lib directory, or packaging it into the fat jar, makes no difference...

At 2020-06-16 13:49:27, "Zhou Zach"  wrote:

org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to 
initialize the cluster entrypoint YarnJobClusterEntrypoint.

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)

at 
org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)

Caused by: java.io.IOException: Could not create FileSystem for highly 
available storage path (hdfs:/flink/ha/application_1592215995564_0027)

at 
org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)

at 
org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)

at 
org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)

at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)

... 2 more

Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could 
not find a file system implementation for scheme 'hdfs'. The scheme is not 
directly supported by Flink and no Hadoop file system to support this scheme 
could be loaded.

at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)

at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)

at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)

at 
org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)

... 13 more

Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: 
Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in the 
classpath, or some classes are missing from the classpath.

at 
org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:184)

at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:446)

... 16 more

Caused by: java.lang.VerifyError: Bad return type

Exception Details:

  Location:


org/apache/hadoop/hdfs/DFSClient.getQuotaUsage(Ljava/lang/String;)Lorg/apache/hadoop/fs/QuotaUsage;
 @160: areturn

  Reason:

Type 'org/apache/hadoop/fs/ContentSummary' (current frame, stack[0]) is not 
assignable to 'org/apache/hadoop/fs/QuotaUsage' (from method signature)

  Current Frame:

bci: @160

flags: { }

locals: { 'org/apache/hadoop/hdfs/DFSClient', 'java/lang/String', 
'org/apache/hadoop/ipc/RemoteException', 'java/io/IOException' }

stack: { 'org/apache/hadoop/fs/ContentSummary' }






It runs fine locally in IntelliJ IDEA; the flink job reads from kafka and sinks to mysql and hbase. In the cluster's flink lib directory,




 

flink sql job fails when submitted to YARN

2020-06-15 Posted by Zhou Zach
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to 
initialize the cluster entrypoint YarnJobClusterEntrypoint.

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)

at 
org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)

Caused by: java.io.IOException: Could not create FileSystem for highly 
available storage path (hdfs:/flink/ha/application_1592215995564_0027)

at 
org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)

at 
org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)

at 
org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)

at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)

at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)

... 2 more

Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could 
not find a file system implementation for scheme 'hdfs'. The scheme is not 
directly supported by Flink and no Hadoop file system to support this scheme 
could be loaded.

at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)

at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)

at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)

at 
org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)

... 13 more

Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: 
Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in the 
classpath, or some classes are missing from the classpath.

at 
org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:184)

at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:446)

... 16 more

Caused by: java.lang.VerifyError: Bad return type

Exception Details:

  Location:


org/apache/hadoop/hdfs/DFSClient.getQuotaUsage(Ljava/lang/String;)Lorg/apache/hadoop/fs/QuotaUsage;
 @160: areturn

  Reason:

Type 'org/apache/hadoop/fs/ContentSummary' (current frame, stack[0]) is not 
assignable to 'org/apache/hadoop/fs/QuotaUsage' (from method signature)

  Current Frame:

bci: @160

flags: { }

locals: { 'org/apache/hadoop/hdfs/DFSClient', 'java/lang/String', 
'org/apache/hadoop/ipc/RemoteException', 'java/io/IOException' }

stack: { 'org/apache/hadoop/fs/ContentSummary' }






It runs fine locally in IntelliJ IDEA; the flink job reads from kafka and sinks to mysql and hbase. In the cluster's flink lib directory,

Re:Re:Re: flink sql: how to convert the BYTES type read from hbase to Int

2020-06-15 Posted by Zhou Zach
The dimension table in HBase:


streamTableEnv.sqlUpdate(
"""
|
|CREATE TABLE user_hbase3(
|rowkey string,
|cf ROW(sex VARCHAR, age INT, created_time TIMESTAMP(3))
|) WITH (
|'connector.type' = 'hbase',
|'connector.version' = '2.1.0',
|'connector.table-name' = 'user_hbase3',
|'connector.zookeeper.quorum' = 'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.zookeeper.znode.parent' = '/hbase',
|'connector.write.buffer-flush.max-size' = '10mb',
|'connector.write.buffer-flush.max-rows' = '1000',
|'connector.write.buffer-flush.interval' = '2s'
|)
|""".stripMargin)

At 2020-06-15 20:19:22, "Zhou Zach"  wrote:
>val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
>val blinkEnvSettings = 
>EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
>val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
>blinkEnvSettings)
>
>val conf = new Configuration
>val users = new HBaseTableSource(conf, "user_hbase3")
>users.setRowKey("rowkey", classOf[String]) // currency as the primary key
>users.addColumn("cf", "age", classOf[Array[Byte]])
>
>streamTableEnv.registerTableSource("users", users)
>
>
>streamTableEnv.sqlUpdate(
>"""
>|
>|CREATE TABLE user_behavior (
>|uid VARCHAR,
>|phoneType VARCHAR,
>|clickCount INT,
>|`time` TIMESTAMP(3)
>|) WITH (
>|'connector.type' = 'kafka',
>|'connector.version' = 'universal',
>|'connector.topic' = 'user_behavior',
>|'connector.startup-mode' = 'earliest-offset',
>|'connector.properties.0.key' = 'zookeeper.connect',
>|'connector.properties.0.value' = 'cdh1:2181,cdh2:2181,cdh3:2181',
>|'connector.properties.1.key' = 'bootstrap.servers',
>|'connector.properties.1.value' = 'cdh1:9092,cdh2:9092,cdh3:9092',
>|'update-mode' = 'append',
>|'format.type' = 'json',
>|'format.derive-schema' = 'true'
>|)
>|""".stripMargin)
>
>streamTableEnv.sqlUpdate(
>"""
>|
>|CREATE TABLE user_cnt (
>|`time` VARCHAR,
>|sum_age INT
>|) WITH (
>|'connector.type' = 'jdbc',
>|'connector.url' = 'jdbc:mysql://localhost:3306/dashboard',
>|'connector.table' = 'user_cnt',
>|'connector.username' = 'root',
>|'connector.password' = '123456',
>|'connector.write.flush.max-rows' = '1'
>|)
>|""".stripMargin)
>
>
>streamTableEnv.sqlUpdate(
>"""
>|
>|insert into  user_cnt
>|SELECT
>|  cast(b.`time` as string) as `time`,  u.age
>|FROM
>|  (select * , PROCTIME() AS proctime from user_behavior) AS b
>|  JOIN users FOR SYSTEM_TIME AS OF b.`proctime` AS u
>    |  ON b.uid = u.rowkey
>|
>|""".stripMargin)
>
>
On 2020-06-15 20:01:16, "Leonard Xu" wrote:
>>Hi,
>>It looks like the schema of your query and the schema of the registered (sink) table do not match. HBase stores everything as bytes, but in Flink
>>SQL you normally do not need to read bytes; the values you read should map to the corresponding Flink SQL types such as int, bigint or string.
>>Could you paste your SQL?
>>
>>Best,
>>Leonard Xu
>>
>>> On Jun 15, 2020, at 19:55, Zhou Zach wrote:
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
>>> Field types of query result and registered TableSink 
>>> default_catalog.default_database.user_cnt do not match.
>>> Query schema: [time: STRING, age: BYTES]
>>> Sink schema: [time: STRING, sum_age: INT]


Re:Re: flink sql: how to convert the BYTES type read from hbase to Int

2020-06-15 Posted by Zhou Zach
val streamExecutionEnv = StreamExecutionEnvironment.getExecutionEnvironment
val blinkEnvSettings = 
EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val streamTableEnv = StreamTableEnvironment.create(streamExecutionEnv, 
blinkEnvSettings)

val conf = new Configuration
val users = new HBaseTableSource(conf, "user_hbase3")
users.setRowKey("rowkey", classOf[String]) // currency as the primary key
users.addColumn("cf", "age", classOf[Array[Byte]])

streamTableEnv.registerTableSource("users", users)


streamTableEnv.sqlUpdate(
"""
|
|CREATE TABLE user_behavior (
|uid VARCHAR,
|phoneType VARCHAR,
|clickCount INT,
|`time` TIMESTAMP(3)
|) WITH (
|'connector.type' = 'kafka',
|'connector.version' = 'universal',
|'connector.topic' = 'user_behavior',
|'connector.startup-mode' = 'earliest-offset',
|'connector.properties.0.key' = 'zookeeper.connect',
|'connector.properties.0.value' = 'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.properties.1.key' = 'bootstrap.servers',
|'connector.properties.1.value' = 'cdh1:9092,cdh2:9092,cdh3:9092',
|'update-mode' = 'append',
|'format.type' = 'json',
|'format.derive-schema' = 'true'
|)
|""".stripMargin)

streamTableEnv.sqlUpdate(
"""
|
|CREATE TABLE user_cnt (
|`time` VARCHAR,
|sum_age INT
|) WITH (
|'connector.type' = 'jdbc',
|'connector.url' = 'jdbc:mysql://localhost:3306/dashboard',
|'connector.table' = 'user_cnt',
|'connector.username' = 'root',
|'connector.password' = '123456',
|'connector.write.flush.max-rows' = '1'
|)
|""".stripMargin)


streamTableEnv.sqlUpdate(
"""
|
|insert into  user_cnt
|SELECT
|  cast(b.`time` as string) as `time`,  u.age
|FROM
|  (select * , PROCTIME() AS proctime from user_behavior) AS b
|  JOIN users FOR SYSTEM_TIME AS OF b.`proctime` AS u
|  ON b.uid = u.rowkey
|
|""".stripMargin)

On 2020-06-15 20:01:16, "Leonard Xu" wrote:
>Hi,
>It looks like the schema of your query and the schema of the registered (sink) table do not match. HBase stores everything as bytes, but in Flink SQL
>you normally do not need to read bytes; the values you read should map to the corresponding Flink SQL types such as int, bigint or string. Could you paste your SQL?
>
>Best,
>Leonard Xu
>
>> On Jun 15, 2020, at 19:55, Zhou Zach wrote:
>> 
>> 
>> 
>> 
>> 
>> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
>> Field types of query result and registered TableSink 
>> default_catalog.default_database.user_cnt do not match.
>> Query schema: [time: STRING, age: BYTES]
>> Sink schema: [time: STRING, sum_age: INT]


flink sql: how to convert the BYTES type read from hbase to Int

2020-06-15 Posted by Zhou Zach




Exception in thread "main" org.apache.flink.table.api.ValidationException: 
Field types of query result and registered TableSink 
default_catalog.default_database.user_cnt do not match.
Query schema: [time: STRING, age: BYTES]
Sink schema: [time: STRING, sum_age: INT]

Re:Re: flink sql sink hbase failed

2020-06-15 Posted by Zhou Zach
After modifying the source code, it works now.

On 2020-06-15 16:17:46, "Leonard Xu" wrote:
>Hi
>
>
>> On Jun 15, 2020, at 15:36, Zhou Zach wrote:
>> 
>> 'connector.version' expects '1.4.3', but is '2.1.0'
>
>The HBase connector only supports version 1.4.3; other versions are not supported. That said, community users have reported writing to a newer HBase version with the 1.4.3 connector, so you can give it a try.
>
>Best
>Leonard Xu


flink sql sink hbase failed

2020-06-15 Posted by Zhou Zach
flink version: 1.10.0
hbase version: 2.1.0




SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console.
Exception in thread "main" 
org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a 
suitable table factory for 'org.apache.flink.table.factories.TableSinkFactory' 
in
the classpath.


Reason: Required context properties mismatch.


The matching candidates:
org.apache.flink.addons.hbase.HBaseTableFactory
Mismatched properties:
'connector.version' expects '1.4.3', but is '2.1.0'


The following properties are requested:
connector.table-name=user_hbase
connector.type=hbase
connector.version=2.1.0
connector.write.buffer-flush.interval=2s
connector.write.buffer-flush.max-rows=1000
connector.write.buffer-flush.max-size=10mb
connector.zookeeper.quorum=cdh1:2181,cdh2:2181,cdh3:2181
connector.zookeeper.znode.parent=/hbase
schema.0.data-type=VARCHAR(2147483647)
schema.0.name=rowkey
schema.1.data-type=ROW<`sex` VARCHAR(2147483647), `age` INT, `created_time` 
TIMESTAMP(3)>
schema.1.name=cf


The following factories have been considered:
org.apache.flink.api.java.io.jdbc.JDBCTableSourceSinkFactory
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
org.apache.flink.addons.hbase.HBaseTableFactory
at 
org.apache.flink.table.factories.TableFactoryService.filterByContext(TableFactoryService.java:322)
at 
org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:190)
at 
org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:143)
at 
org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:96)
at 
org.apache.flink.table.planner.delegation.PlannerBase.getTableSink(PlannerBase.scala:310)
at 
org.apache.flink.table.planner.delegation.PlannerBase.translateToRel(PlannerBase.scala:190)
at 
org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:150)
at 
org.apache.flink.table.planner.delegation.PlannerBase$$anonfun$1.apply(PlannerBase.scala:150)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at 
org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:150)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:682)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:495)
at org.rabbit.sql.FromKafkaSinkHbase$.main(FromKafkaSinkHbase.scala:61)
at org.rabbit.sql.FromKafkaSinkHbase.main(FromKafkaSinkHbase.scala)






Query:


streamTableEnv.sqlUpdate(
"""
|
|CREATE TABLE user_hbase(
|rowkey string,
|cf ROW(sex VARCHAR, age INT, created_time TIMESTAMP(3))
|) WITH (
|'connector.type' = 'hbase',
|'connector.version' = '2.1.0',
|'connector.table-name' = 'user_hbase',
|'connector.zookeeper.quorum' = 'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.zookeeper.znode.parent' = '/hbase',
|'connector.write.buffer-flush.max-size' = '10mb',
|'connector.write.buffer-flush.max-rows' = '1000',
|'connector.write.buffer-flush.interval' = '2s'
|)
|""".stripMargin)

Does flink sql DDL support defining a Temporal Table?

2020-06-14 Posted by Zhou Zach
According to the docs at https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/streaming/joins.html#join-with-a-temporal-table,
a temporal table source must implement LookupableTableSource.
However, https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/connect.html#jdbc-connector shows the following:


-- lookup options, optional, used in temporary join
'connector.lookup.cache.max-rows' = '5000', -- optional, max number of rows of lookup cache, over this value, the oldest rows will
                                            -- be eliminated. "cache.max-rows" and "cache.ttl" options must all be specified if any
                                            -- of them is specified. Cache is not enabled as default.
'connector.lookup.cache.ttl' = '10s',       -- optional, the max time to live for each rows in lookup cache, over this time, the oldest rows
                                            -- will be expired. "cache.max-rows" and "cache.ttl" options must all be specified if any of
                                            -- them is specified. Cache is not enabled as default.
'connector.lookup.max-retries' = '3'        -- optional, max retry times if lookup database failed


Does that mean that if these three options are added to a flink sql JDBC Connector DDL, the created table is a Temporal Table that can be used in a temporal join?
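
For reference, a minimal sketch of the pattern in question on Flink 1.10, reusing the names from the other threads in this digest (MySQL table users with columns uid and age, the dashboard database, and a user_behavior source declared with a proctime AS PROCTIME() computed column as discussed in the Temporal table join thread below). A JDBC table defined via DDL can serve as the build side of a FOR SYSTEM_TIME AS OF join because the JDBC table source supports lookups; the three connector.lookup.* options only tune the lookup cache, and the join works with or without them.

streamTableEnv.sqlUpdate(
  """
    |CREATE TABLE users (
    |uid VARCHAR,
    |age INT
    |) WITH (
    |'connector.type' = 'jdbc',
    |'connector.url' = 'jdbc:mysql://localhost:3306/dashboard',
    |'connector.table' = 'users',
    |'connector.username' = 'root',
    |'connector.password' = '123456',
    |'connector.lookup.cache.max-rows' = '5000',
    |'connector.lookup.cache.ttl' = '10s',
    |'connector.lookup.max-retries' = '3'
    |)
    |""".stripMargin)

// join against the dimension table using the fact table's processing-time attribute
streamTableEnv.sqlUpdate(
  """
    |insert into user_cnt
    |SELECT cast(b.`time` as string), u.age
    |FROM user_behavior AS b
    |  JOIN users FOR SYSTEM_TIME AS OF b.proctime AS u
    |  ON b.uid = u.uid
    |""".stripMargin)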





Re:Reply: flink sql Temporal table join failed

2020-06-12 Posted by Zhou Zach
OK.

On 2020-06-12 17:46:22, "咖啡泡油条" <9329...@qq.com> wrote:
>You can refer to this earlier mailing-list thread:
>https://lists.apache.org/thread.html/r951ca3dfa24598b2c90f9d2172d5228c4689b8a710d7dc119055c5d3%40%3Cuser-zh.flink.apache.org%3E
>
>
>
>
>-- Original message --
>From: "Leonard Xu"    Sent: Friday, June 12, 2020, 5:43 PM
>To: "user-zh"
>Subject: Re: flink sql Temporal table join failed
>
>
>
>
>You have hit exactly this pitfall: it is a bug in escaping the flink reserved keyword (time), fixed in 1.10.1 and later versions (including the upcoming 1.11).
>
>Best
>Leonard Xu
>
> On Jun 12, 2020, at 17:38, Zhou Zach wrote:
> 
> 
> 
> Yes, version 1.10.0.
> 
> 
> 
> 
> 
> 
> 
> 
> On 2020-06-12 16:28:15, "Benchao Li" wrote: It looks like you have hit another known issue. You are on 1.10.0, right? Try switching to 1.10.1; two relevant bugs have already been fixed in 1.10.1.
> 
> Zhou Zach wrote:
> Still not working,
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> 
>[jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> 
>[jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for 
>an
> explanation.
> SLF4J: Actual binding is of type
> [org.apache.logging.slf4j.Log4jLoggerFactory]
> ERROR StatusLogger No log4j2 configuration file found. Using 
>default
> configuration: logging only errors to the console.
> Exception in thread "main" 
>org.apache.flink.table.api.SqlParserException:
> SQL parse failed. Encountered "time FROM" at line 1, column 44.
> Was expecting one of:
> "CURSOR" ...
> "EXISTS" ...
> "NOT" ...
> "ROW" ...
> "(" ...
> "+" ...
> "-" ...
>"TRUE" ...
> "FALSE" ...
> "UNKNOWN" ...
> "NULL" ...
>"DATE" ...
> "TIME"  "TIMESTAMP" ...
> "INTERVAL" ...
> "?" ...
> "CAST" ...
> "EXTRACT" ...
> "POSITION" ...
> "CONVERT" ...
> "TRANSLATE" ...
> "OVERLAY" ...
> "FLOOR" ...
> "CEIL" ...
> "CEILING" ...
> "SUBSTRING" ...
> "TRIM" ...
> "CLASSIFIER" ...
> "MATCH_NUMBER" ...
> "RUNNING" ...
> "PREV" ...
> "NEXT" ...
> "JSON_EXISTS" ...
> "JSON_VALUE" ...
> "JSON_QUERY" ...
> "JSON_OBJECT" ...
> "JSON_OBJECTAGG" ...
> "JSON_ARRAY" ...
> "JSON_ARRAYAGG" ...
>  "MULTISET" ...
> "ARRAY" ...
> "MAP" ...
> "PERIOD" ...
> "SPECIFIC" ...
>  "ABS" ...
> "AVG" ...
> "CARDINALITY" ...
> "CHAR_LENGTH" ...
> "CHARACTER_LENGTH" ...
> "COALESCE" ...
> "COLLECT" ...
> "COVAR_POP" ...
> "COVAR_SAMP" ...
> "CUME_DIST" ...
> "COUNT" ...
> "CURRENT_DATE" ...
> "CURRENT_TIME" ...
> "CURRENT_TIMESTAMP" ...
> "DENSE_RANK" ...
> "ELEMENT" ...
> "EXP" ...
> "FIRST_VALUE" ...
> "FUSION" ...
> "GROUPING" ...
> "HOUR" ...
> "LAG" ...
> "LEAD" ...
> "LEFT" ...
> "LAST_VALUE" ...
> "LN" ...
> "LOCALTIME" ...
> "LOCALTIMESTAMP" ...
> "LOWER" ...
> "MAX" ...
> "MIN" ...
> "MINUTE" ...
> "MOD" ...
> "MONTH" ...
> "NTH_VALUE" ...
> "NTILE" ...
> "NULLIF" ...
> "OCTET_LENGTH" ...
> "PERCENT_RANK" ...
> "POWER" ...
> "RANK" ...
> "REGR_COUNT" ...
> "REGR_SXX" ...
> "REGR_SYY" ...
> "RIGHT" ...
> "ROW_NUMBER" ...
> "SECOND" ...
> "SQRT" ...
> "STDDEV_POP" ...
> "STDDEV_SAMP" ...
> "SUM" ...
> "UPPER" ...
> "TRUNCATE" ...
> "USER" ...
> "VAR_POP" ...
> "VAR_SAMP" ...
> "YEAR" ...
> "CURRENT_CATALOG" ...
> "CURRENT_DEFAULT_TRANSFORM_GROUP" ...
> "CURRENT_PATH" ...
> "CURRENT_ROLE" ...
> "CURRENT_SCHEMA" ...
> "CURRENT_USER" ...
> "SESSION_USER" ...
>

Re:Re: flink sql Temporal table join failed

2020-06-12 Posted by Zhou Zach
Thanks for the reminder.

On 2020-06-12 17:43:20, "Leonard Xu" wrote:
>
>You have hit exactly this pitfall: it is a bug in escaping the flink reserved keyword (time), fixed in 1.10.1 and later versions (including the upcoming 1.11).
>
>Best
>Leonard Xu
>
>> On Jun 12, 2020, at 17:38, Zhou Zach wrote:
>> 
>> 
>> 
>> 
>> Yes, version 1.10.0.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On 2020-06-12 16:28:15, "Benchao Li" wrote:
>>> It looks like you have hit another known issue. You are on 1.10.0, right? Try switching to 1.10.1; two relevant bugs have already been fixed in 1.10.1.
>>> 
>>> Zhou Zach wrote on Fri, Jun 12, 2020 at 3:47 PM:
>>> 
>>>> Still not working,
>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>> SLF4J: Found binding in
>>>> [jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: Found binding in
>>>> [jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>>>> explanation.
>>>> SLF4J: Actual binding is of type
>>>> [org.apache.logging.slf4j.Log4jLoggerFactory]
>>>> ERROR StatusLogger No log4j2 configuration file found. Using default
>>>> configuration: logging only errors to the console.
>>>> Exception in thread "main" org.apache.flink.table.api.SqlParserException:
>>>> SQL parse failed. Encountered "time FROM" at line 1, column 44.
>>>> Was expecting one of:
>>>>"CURSOR" ...
>>>>"EXISTS" ...
>>>>"NOT" ...
>>>>"ROW" ...
>>>>"(" ...
>>>>"+" ...
>>>>"-" ...
>>>> ...
>>>> ...
>>>> ...
>>>> ...
>>>> ...
>>>> ...
>>>> ...
>>>>"TRUE" ...
>>>>"FALSE" ...
>>>>"UNKNOWN" ...
>>>>"NULL" ...
>>>> ...
>>>> ...
>>>> ...
>>>>"DATE" ...
>>>>"TIME"  ...
>>>>"TIMESTAMP" ...
>>>>"INTERVAL" ...
>>>>"?" ...
>>>>"CAST" ...
>>>>"EXTRACT" ...
>>>>"POSITION" ...
>>>>"CONVERT" ...
>>>>"TRANSLATE" ...
>>>>"OVERLAY" ...
>>>>"FLOOR" ...
>>>>"CEIL" ...
>>>>"CEILING" ...
>>>>"SUBSTRING" ...
>>>>"TRIM" ...
>>>>"CLASSIFIER" ...
>>>>"MATCH_NUMBER" ...
>>>>"RUNNING" ...
>>>>"PREV" ...
>>>>"NEXT" ...
>>>>"JSON_EXISTS" ...
>>>>"JSON_VALUE" ...
>>>>"JSON_QUERY" ...
>>>>"JSON_OBJECT" ...
>>>>"JSON_OBJECTAGG" ...
>>>>"JSON_ARRAY" ...
>>>>"JSON_ARRAYAGG" ...
>>>> ...
>>>>"MULTISET" ...
>>>>"ARRAY" ...
>>>>"MAP" ...
>>>>"PERIOD" ...
>>>>"SPECIFIC" ...
>>>> ...
>>>> ...
>>>> ...
>>>> ...
>>>> ...
>>>>"ABS" ...
>>>>"AVG" ...
>>>>"CARDINALITY" ...
>>>>"CHAR_LENGTH" ...
>>>>"CHARACTER_LENGTH" ...
>>>>"COALESCE" ...
>>>>"COLLECT" ...
>>>>"COVAR_POP" ...
>>>>"COVAR_SAMP" ...
>>>>"CUME_DIST" ...
>>>>"COUNT" ...
>>>>"CURRENT_DATE" ...
>>>>"CURRENT_TIME" ...
>>>>"CURRENT_TIMESTAMP" ...
>>>>"DENSE_RANK" ...
>>>>"ELEMENT" ...
>>>>"EXP" ...
>>>>"FIRST_VALUE" ...
>>>>"FUSION" ...
>>>>"GROUPING" ...
>>>>"HOUR" ...
>>>>"L

Re:Re: Re: Re: flink sql Temporal table join failed

2020-06-12 Posted by Zhou Zach



Yes, version 1.10.0.

On 2020-06-12 16:28:15, "Benchao Li" wrote:
>It looks like you have hit another known issue. You are on 1.10.0, right? Try switching to 1.10.1; two relevant bugs have already been fixed in 1.10.1.
>
>Zhou Zach wrote on Fri, Jun 12, 2020 at 3:47 PM:
>
>> Still not working,
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> SLF4J: Actual binding is of type
>> [org.apache.logging.slf4j.Log4jLoggerFactory]
>> ERROR StatusLogger No log4j2 configuration file found. Using default
>> configuration: logging only errors to the console.
>> Exception in thread "main" org.apache.flink.table.api.SqlParserException:
>> SQL parse failed. Encountered "time FROM" at line 1, column 44.
>> Was expecting one of:
>> "CURSOR" ...
>> "EXISTS" ...
>> "NOT" ...
>> "ROW" ...
>> "(" ...
>> "+" ...
>> "-" ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>> "TRUE" ...
>> "FALSE" ...
>> "UNKNOWN" ...
>> "NULL" ...
>>  ...
>>  ...
>>  ...
>> "DATE" ...
>> "TIME"  ...
>> "TIMESTAMP" ...
>> "INTERVAL" ...
>> "?" ...
>> "CAST" ...
>> "EXTRACT" ...
>> "POSITION" ...
>> "CONVERT" ...
>> "TRANSLATE" ...
>> "OVERLAY" ...
>> "FLOOR" ...
>> "CEIL" ...
>> "CEILING" ...
>> "SUBSTRING" ...
>> "TRIM" ...
>> "CLASSIFIER" ...
>> "MATCH_NUMBER" ...
>> "RUNNING" ...
>> "PREV" ...
>> "NEXT" ...
>> "JSON_EXISTS" ...
>> "JSON_VALUE" ...
>> "JSON_QUERY" ...
>> "JSON_OBJECT" ...
>> "JSON_OBJECTAGG" ...
>> "JSON_ARRAY" ...
>> "JSON_ARRAYAGG" ...
>>  ...
>> "MULTISET" ...
>> "ARRAY" ...
>> "MAP" ...
>> "PERIOD" ...
>> "SPECIFIC" ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>> "ABS" ...
>> "AVG" ...
>> "CARDINALITY" ...
>> "CHAR_LENGTH" ...
>> "CHARACTER_LENGTH" ...
>> "COALESCE" ...
>> "COLLECT" ...
>> "COVAR_POP" ...
>> "COVAR_SAMP" ...
>> "CUME_DIST" ...
>> "COUNT" ...
>> "CURRENT_DATE" ...
>> "CURRENT_TIME" ...
>> "CURRENT_TIMESTAMP" ...
>> "DENSE_RANK" ...
>> "ELEMENT" ...
>> "EXP" ...
>> "FIRST_VALUE" ...
>> "FUSION" ...
>> "GROUPING" ...
>> "HOUR" ...
>> "LAG" ...
>> "LEAD" ...
>> "LEFT" ...
>> "LAST_VALUE" ...
>> "LN" ...
>> "LOCALTIME" ...
>> "LOCALTIMESTAMP" ...
>> "LOWER" ...
>> "MAX" ...
>> "MIN" ...
>> "MINUTE" ...
>> "MOD" ...
>> "MONTH" ...
>> "NTH_VALUE" ...
>> "NTILE" ...
>> "NULLIF" ...
>> "OCTET_LENGTH" ...
>> "PERCENT_RANK" ...
>> "POWER" ...
>> "RANK" ...
>> "REGR_COUNT" ...
>> "REGR_SXX" ...
>> "REGR_SYY" ...
>> "RIGHT" ...
>> "ROW_NUMBER" ...
>> "SECOND" ...
>> "SQRT" ...
>> "STDDEV_POP" ...
>> "STDDEV_SAMP" ...
>> "SUM" ...
>>   

Re:Re: Re: flink sql Temporal table join failed

2020-06-12 Posted by Zhou Zach
el.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:646)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:627)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3181)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:563)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:148)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:135)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.toQueryOperation(SqlToOperationConverter.java:522)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlQuery(SqlToOperationConverter.java:436)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:154)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
at 
org.rabbit.sql.FromKafkaSinkMysqlForReal$.main(FromKafkaSinkMysqlForReal.scala:63)
at 
org.rabbit.sql.FromKafkaSinkMysqlForReal.main(FromKafkaSinkMysqlForReal.scala)


query:


streamTableEnv.sqlUpdate(
"""
|
|CREATE TABLE user_behavior (
|uid VARCHAR,
|phoneType VARCHAR,
    |clickCount INT,
|proctime AS PROCTIME(),
|`time` TIMESTAMP(3)
|) WITH (
|'connector.type' = 'kafka',
|'connector.version' = 'universal',
|'connector.topic' = 'user_behavior',
|'connector.startup-mode' = 'earliest-offset',
|'connector.properties.0.key' = 'zookeeper.connect',
|'connector.properties.0.value' = 'cdh1:2181,cdh2:2181,cdh3:2181',
|'connector.properties.1.key' = 'bootstrap.servers',
|'connector.properties.1.value' = 'cdh1:9092,cdh2:9092,cdh3:9092',
|'update-mode' = 'append',
|'format.type' = 'json',
|'format.derive-schema' = 'true'
|)
|""".stripMargin)
streamTableEnv.sqlUpdate(
"""
|
|insert into  user_cnt
|SELECT
|  cast(b.`time` as string), u.age
|FROM
|  user_behavior AS b
|  JOIN users FOR SYSTEM_TIME AS OF b.`proctime` AS u
|  ON b.uid = u.uid
|
|""".stripMargin)






However, putting PROCTIME() AS proctime in the SELECT clause does execute successfully; putting proctime AS PROCTIME() in the SELECT clause does not work either.
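
Benchao Li's reply quoted below points out that the computed-column syntax is column_name AS expression, i.e. the reverse of AS in a SELECT list. A minimal sketch of the corrected source table (with a column actually named `time`, this exact schema also needs Flink 1.10.1 because of the reserved-keyword escaping bug discussed elsewhere in this thread):

streamTableEnv.sqlUpdate(
  """
    |CREATE TABLE user_behavior (
    |uid VARCHAR,
    |phoneType VARCHAR,
    |clickCount INT,
    |`time` TIMESTAMP(3),
    |proctime AS PROCTIME()
    |) WITH (
    |'connector.type' = 'kafka',
    |'connector.version' = 'universal',
    |'connector.topic' = 'user_behavior',
    |'connector.startup-mode' = 'earliest-offset',
    |'connector.properties.0.key' = 'zookeeper.connect',
    |'connector.properties.0.value' = 'cdh1:2181,cdh2:2181,cdh3:2181',
    |'connector.properties.1.key' = 'bootstrap.servers',
    |'connector.properties.1.value' = 'cdh1:9092,cdh2:9092,cdh3:9092',
    |'update-mode' = 'append',
    |'format.type' = 'json',
    |'format.derive-schema' = 'true'
    |)
    |""".stripMargin)

With proctime declared this way it is a proper processing-time attribute, so the insert statement can reference FOR SYSTEM_TIME AS OF b.proctime unchanged.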

On 2020-06-12 15:29:49, "Benchao Li" wrote:
>You have it backwards; it should be proctime AS PROCTIME().
>For computed columns, AS works in the opposite direction from AS in a normal query.
>
>Zhou Zach wrote on Fri, Jun 12, 2020 at 2:24 PM:
>
>> flink 1.10.0:
>> Adding a PROCTIME() AS proctime column in CREATE TABLE raises an error
>>
>> On 2020-06-12 14:08:11, "Benchao Li" wrote:
>> >Hi,
>> >
>> >A Temporal Table join requires a processing-time attribute; your b.`time` is an ordinary timestamp, not an event-time attribute.
>> >You can refer to [1]
>> >
>> >[1]
>> >
>> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/time_attributes.html
>> >
>> >Zhou Zach wrote on Fri, Jun 12, 2020 at 1:33 PM:
>> >
>> >> SLF4J: Class path contains multiple SLF4J bindings.
>> >>
>> >> SLF4J: Found binding in
>> >>
>> [jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> >>
>> >> SLF4J: Found binding in
>> >>
>> [jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> >>
>> >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> >> explanation.
>> >>
>> >> SLF4J: Actual binding is of type
>> >> [org.apache.logging.slf4j.Log4jLoggerFactory]
>> >>
>> >> ERROR StatusLogger No log4j2 configuration file found. Using default
>> >> configuration: logging only errors to the console.
>> >>
>> >> Exception in thread "main" org.apache.flink.table.api.TableException:
>> >> Cannot generate a valid execution plan for the given query:
>> >>
>> >>
>> >>
>> >>
>> >> FlinkLogicalSink(name=[`default_catalog`.`default_database`.`user_cnt`],
>> >> fields=[time, sum_age])
>> >>
>> >> +- FlinkLogicalCalc(select

Re:Re: flink sql Temporal table join failed

2020-06-12 Posted by Zhou Zach
flink 1.10.0:
Adding a PROCTIME() AS proctime column in CREATE TABLE raises an error

On 2020-06-12 14:08:11, "Benchao Li" wrote:
>Hi,
>
>A Temporal Table join requires a processing-time attribute; your b.`time` is an ordinary timestamp, not an event-time attribute.
>You can refer to [1]
>
>[1]
>https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/time_attributes.html
>
>Zhou Zach wrote on Fri, Jun 12, 2020 at 1:33 PM:
>
>> SLF4J: Class path contains multiple SLF4J bindings.
>>
>> SLF4J: Found binding in
>> [jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>
>> SLF4J: Found binding in
>> [jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>>
>> SLF4J: Actual binding is of type
>> [org.apache.logging.slf4j.Log4jLoggerFactory]
>>
>> ERROR StatusLogger No log4j2 configuration file found. Using default
>> configuration: logging only errors to the console.
>>
>> Exception in thread "main" org.apache.flink.table.api.TableException:
>> Cannot generate a valid execution plan for the given query:
>>
>>
>>
>>
>> FlinkLogicalSink(name=[`default_catalog`.`default_database`.`user_cnt`],
>> fields=[time, sum_age])
>>
>> +- FlinkLogicalCalc(select=[CAST(time) AS EXPR$0, age])
>>
>>+- FlinkLogicalJoin(condition=[=($0, $2)], joinType=[inner])
>>
>>   :- FlinkLogicalCalc(select=[uid, time])
>>
>>   :  +- FlinkLogicalTableSourceScan(table=[[default_catalog,
>> default_database, user_behavior, source: [KafkaTableSource(uid, phoneType,
>> clickCount, time)]]], fields=[uid, phoneType, clickCount, time])
>>
>>   +- FlinkLogicalSnapshot(period=[$cor0.time])
>>
>>  +- FlinkLogicalCalc(select=[uid, age])
>>
>> +- FlinkLogicalTableSourceScan(table=[[default_catalog,
>> default_database, users, source: [MysqlAsyncLookupTableSource(uid, sex,
>> age, created_time)]]], fields=[uid, sex, age, created_time])
>>
>>
>>
>>
>> Temporal table join currently only supports 'FOR SYSTEM_TIME AS OF' left
>> table's proctime field, doesn't support 'PROCTIME()'
>>
>> Please check the documentation for the set of currently supported SQL
>> features.
>>
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:78)
>>
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
>>
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
>>
>> at
>> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>>
>> at
>> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>>
>> at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>>
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>>
>> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>
>> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>>
>> at
>> scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
>>
>> at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
>>
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)
>>
>> at
>> org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:170)
>>
>> at
>> org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:90)
>>
>> at
>> org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
>>
>> at
>> org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:248)
>>
>> at
>> org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:151)
>>
>> at
>> org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:682)
>>
>> at
>> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnviro

flink sql Temporal table join failed

2020-06-11 Posted by Zhou Zach
SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console.

Exception in thread "main" org.apache.flink.table.api.TableException: Cannot 
generate a valid execution plan for the given query: 




FlinkLogicalSink(name=[`default_catalog`.`default_database`.`user_cnt`], 
fields=[time, sum_age])

+- FlinkLogicalCalc(select=[CAST(time) AS EXPR$0, age])

   +- FlinkLogicalJoin(condition=[=($0, $2)], joinType=[inner])

  :- FlinkLogicalCalc(select=[uid, time])

  :  +- FlinkLogicalTableSourceScan(table=[[default_catalog, 
default_database, user_behavior, source: [KafkaTableSource(uid, phoneType, 
clickCount, time)]]], fields=[uid, phoneType, clickCount, time])

  +- FlinkLogicalSnapshot(period=[$cor0.time])

 +- FlinkLogicalCalc(select=[uid, age])

+- FlinkLogicalTableSourceScan(table=[[default_catalog, 
default_database, users, source: [MysqlAsyncLookupTableSource(uid, sex, age, 
created_time)]]], fields=[uid, sex, age, created_time])




Temporal table join currently only supports 'FOR SYSTEM_TIME AS OF' left 
table's proctime field, doesn't support 'PROCTIME()'

Please check the documentation for the set of currently supported SQL features.

at 
org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:78)

at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)

at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)

at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)

at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)

at scala.collection.Iterator$class.foreach(Iterator.scala:891)

at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)

at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)

at scala.collection.AbstractIterable.foreach(Iterable.scala:54)

at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)

at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)

at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)

at 
org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:170)

at 
org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:90)

at 
org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)

at 
org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:248)

at 
org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:151)

at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:682)

at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:495)

at 
org.rabbit.sql.FromKafkaSinkMysqlForReal$.main(FromKafkaSinkMysqlForReal.scala:90)

at 
org.rabbit.sql.FromKafkaSinkMysqlForReal.main(FromKafkaSinkMysqlForReal.scala)

Caused by: org.apache.flink.table.api.TableException: Temporal table join 
currently only supports 'FOR SYSTEM_TIME AS OF' left table's proctime field, 
doesn't support 'PROCTIME()'

at 
org.apache.flink.table.planner.plan.rules.physical.common.CommonLookupJoinRule$class.matches(CommonLookupJoinRule.scala:67)

at 
org.apache.flink.table.planner.plan.rules.physical.common.BaseSnapshotOnCalcTableScanRule.matches(CommonLookupJoinRule.scala:147)

at 
org.apache.flink.table.planner.plan.rules.physical.common.BaseSnapshotOnCalcTableScanRule.matches(CommonLookupJoinRule.scala:161)

at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.matchRecurse(VolcanoRuleCall.java:263)

at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.matchRecurse(VolcanoRuleCall.java:370)

at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.matchRecurse(VolcanoRuleCall.java:370)

at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.matchRecurse(VolcanoRuleCall.java:370)

at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.matchRecurse(VolcanoRuleCall.java:370)

at 

Re:Re: flink TableEnvironment can not call getTableEnvironment api

2020-06-11 Posted by Zhou Zach
Thanks for the reply. However, according to the docs at
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/joins.html
only the Blink planner can be used, right?
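
For reference, a minimal Scala sketch of creating the table environment on 1.10 with the Blink planner (which the temporal-table docs target), mirroring the Java snippet in Leonard Xu's reply quoted below; it assumes flink-table-api-scala-bridge and flink-table-planner-blink are on the classpath:

import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.EnvironmentSettings
import org.apache.flink.table.api.scala.StreamTableEnvironment

val env = StreamExecutionEnvironment.getExecutionEnvironment
val settings = EnvironmentSettings.newInstance()
  .useBlinkPlanner()
  .inStreamingMode()
  .build()
// replaces the pre-1.9 TableEnvironment.getTableEnvironment(env) shown in the outdated docs
val tEnv = StreamTableEnvironment.create(env, settings)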

On 2020-06-12 11:49:08, "Leonard Xu" wrote:
>Hi,
>This part of the docs was probably missed during the 1.9 update; I will open an issue to fix it. For now it can be used like this [1]:
>StreamExecutionEnvironment env = 
>StreamExecutionEnvironment.getExecutionEnvironment();
>EnvironmentSettings envSettings = EnvironmentSettings.newInstance()
>.useOldPlanner()
>.inStreamingMode()
>.build();
>StreamTableEnvironment tableEnvironment = StreamTableEnvironment.create(env, 
>envSettings);
>
>
>Best,
>Leonard Xu
>[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/common.html#create-a-tableenvironment
>
>> On Jun 12, 2020, at 11:39, Zhou Zach wrote:
>> 
>> 
>> 
>> flink version 1.10.0
>> 
>> 
>> According to the docs at
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/temporal_tables.html#defining-temporal-table
>> I want to define a Temporal Table, but getTableEnvironment cannot be found...
>> 
>> 
>> val env = StreamExecutionEnvironment.getExecutionEnvironment
>> val tEnv = TableEnvironment.getTableEnvironment(env)
>


flink TableEnvironment can not call getTableEnvironment api

2020-06-11 Posted by Zhou Zach


flink version 1.10.0


According to the docs at
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/temporal_tables.html#defining-temporal-table
I want to define a Temporal Table, but getTableEnvironment cannot be found...


val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env)

Re:Re: flink sql bigint cannot be cast to mysql Long

2020-06-11 Posted by Zhou Zach
Thanks.

On 2020-06-11 14:10:53, "Leonard Xu" wrote:
>Hi, 
>
>The JDBC connector did not support unsigned types before; an unsigned type has a wider range than its signed counterpart.
>bigint(20) unsigned (range is 0 to 18446744073709551615) exceeds the range of bigint (range is
>-9223372036854775808 to 9223372036854775807).
>
>
>The latest code has already fixed this [1]; you can try it after 1.11 is released, or build the latest code, and declare decimal(20, 0) for the corresponding column in the Flink table.
>
>Best,
>Leonard Xu
>
>[1] https://issues.apache.org/jira/browse/FLINK-17657
>
>> On Jun 11, 2020, at 13:51, Zhou Zach wrote:
>> 
>> bigint(20) unsigned
>


Re:Re:flink sql bigint cannot be cast to mysql Long

2020-06-10 Posted by Zhou Zach

The MySQL dependency referenced in the project:

   <dependency>
       <groupId>mysql</groupId>
       <artifactId>mysql-connector-java</artifactId>
       <version>5.1.46</version>
   </dependency>

The MySQL server version is 5.7.18-log.
If the MySQL column is bigint, should I declare it as int when creating the table? That would risk truncation, wouldn't it?

At 2020-06-11 13:39:18, "chaojianok"  wrote:
>Check whether the version of the MySQL driver referenced in your project matches the MySQL version you are using, or simply convert the data type.
>
>At 2020-06-11 13:22:07, "Zhou Zach"  wrote:
>>SLF4J: Class path contains multiple SLF4J bindings.
>>
>>SLF4J: Found binding in 
>>[jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>
>>SLF4J: Found binding in 
>>[jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>
>>SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
>>explanation.
>>
>>SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
>>
>>ERROR StatusLogger No log4j2 configuration file found. Using default 
>>configuration: logging only errors to the console.
>>
>>Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>>server's identity verification is not recommended. According to MySQL 
>>5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>>by default if explicit option isn't set. For compliance with existing 
>>applications not using SSL the verifyServerCertificate property is set to 
>>'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>>or set useSSL=true and provide truststore for server certificate verification.
>>
>>Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>>server's identity verification is not recommended. According to MySQL 
>>5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>>by default if explicit option isn't set. For compliance with existing 
>>applications not using SSL the verifyServerCertificate property is set to 
>>'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>>or set useSSL=true and provide truststore for server certificate verification.
>>
>>Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>>server's identity verification is not recommended. According to MySQL 
>>5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>>by default if explicit option isn't set. For compliance with existing 
>>applications not using SSL the verifyServerCertificate property is set to 
>>'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>>or set useSSL=true and provide truststore for server certificate verification.
>>
>>Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>>server's identity verification is not recommended. According to MySQL 
>>5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>>by default if explicit option isn't set. For compliance with existing 
>>applications not using SSL the verifyServerCertificate property is set to 
>>'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>>or set useSSL=true and provide truststore for server certificate verification.
>>
>>Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>>server's identity verification is not recommended. According to MySQL 
>>5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>>by default if explicit option isn't set. For compliance with existing 
>>applications not using SSL the verifyServerCertificate property is set to 
>>'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>>or set useSSL=true and provide truststore for server certificate verification.
>>
>>Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>>server's identity verification is not recommended. According to MySQL 
>>5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>>by default if explicit option isn't set. For compliance with existing 
>>applications not using SSL the verifyServerCertificate property is set to 
>>'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>>or set useSSL=true and provide truststore for server certificate verification.
>>
>>Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>>server's identity verification is not recommended. According to MySQL 
>>5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>>by default if expli

Re:Re: flink sql bigint cannot be cast to mysql Long

2020-06-10 Posted by Zhou Zach
The flink version is 1.10.0.
MySQL tables:
CREATE TABLE `analysis_gift_consume` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `times` int(8) NOT NULL COMMENT '时间[MMdd]',
  `gid` int(4) NOT NULL DEFAULT '0' COMMENT '礼物ID',
  `gname` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT 
'礼物名称',
  `counts` bigint(20) NOT NULL DEFAULT '0' COMMENT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 
COLLATE=utf8mb4_unicode_ci COMMENT='';




CREATE TABLE `analysis_gift_consume1` (

  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,

  `times` int(8) NOT NULL COMMENT '时间[MMdd]',

  `gid` int(4) NOT NULL DEFAULT '0' COMMENT '礼物ID',

  `gname` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT 
'礼物名称',

  `counts` bigint(20) NOT NULL DEFAULT '0' COMMENT '',

  PRIMARY KEY (`id`)

) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 
COLLATE=utf8mb4_unicode_ci COMMENT='';




The database column is of type bigint, and there will always be cases where a MySQL column is set to bigint. If the MySQL column is bigint, what type should be used when creating the Flink SQL table?
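
Per Leonard Xu's reply quoted below, once on a build that contains the FLINK-17657 fix (1.11 or a recent snapshot), the unsigned id column can be declared as DECIMAL(20, 0) on the Flink side, while signed bigint(20) columns such as counts stay BIGINT. A minimal sketch in the 1.10-style DDL used elsewhere in this digest (database name and credentials are placeholders):

streamTableEnv.sqlUpdate(
  """
    |CREATE TABLE analysis_gift_consume (
    |id DECIMAL(20, 0),
    |times INT,
    |gid INT,
    |gname VARCHAR,
    |counts BIGINT
    |) WITH (
    |'connector.type' = 'jdbc',
    |'connector.url' = 'jdbc:mysql://localhost:3306/dashboard',
    |'connector.table' = 'analysis_gift_consume',
    |'connector.username' = 'root',
    |'connector.password' = '123456'
    |)
    |""".stripMargin)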





At 2020-06-11 13:42:11, "Leonard Xu"  wrote:
>Hi,
>Which flink version are you using? Are you sure the database column is of type bigint?
>> Caused by: java.lang.ClassCastException: java.math.BigInteger cannot be cast 
>> to java.lang.Long
>
>The range of java.math.BigInteger is much larger than that of java.lang.Long, so it cannot be cast; your type mapping is probably wrong. Could you paste the schema of the MySQL
>table?
>
>
>Best,
>Leonard Xu
>
>> On Jun 11, 2020, at 13:22, Zhou Zach wrote:
>> 
>> SLF4J: Class path contains multiple SLF4J bindings.
>> 
>> SLF4J: Found binding in 
>> [jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> 
>> SLF4J: Found binding in 
>> [jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> 
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
>> explanation.
>> 
>> SLF4J: Actual binding is of type 
>> [org.apache.logging.slf4j.Log4jLoggerFactory]
>> 
>> ERROR StatusLogger No log4j2 configuration file found. Using default 
>> configuration: logging only errors to the console.
>> 
>> Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>> server's identity verification is not recommended. According to MySQL 
>> 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>> by default if explicit option isn't set. For compliance with existing 
>> applications not using SSL the verifyServerCertificate property is set to 
>> 'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>> or set useSSL=true and provide truststore for server certificate 
>> verification.
>> 
>> Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>> server's identity verification is not recommended. According to MySQL 
>> 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>> by default if explicit option isn't set. For compliance with existing 
>> applications not using SSL the verifyServerCertificate property is set to 
>> 'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>> or set useSSL=true and provide truststore for server certificate 
>> verification.
>> 
>> Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>> server's identity verification is not recommended. According to MySQL 
>> 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>> by default if explicit option isn't set. For compliance with existing 
>> applications not using SSL the verifyServerCertificate property is set to 
>> 'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>> or set useSSL=true and provide truststore for server certificate 
>> verification.
>> 
>> Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>> server's identity verification is not recommended. According to MySQL 
>> 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>> by default if explicit option isn't set. For compliance with existing 
>> applications not using SSL the verifyServerCertificate property is set to 
>> 'false'. You need either to explicitly disable SSL by setting useSSL=false, 
>> or set useSSL=true and provide truststore for server certificate 
>> verification.
>> 
>> Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without 
>> server's identity verification is not recommended. According to MySQL 
>> 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established 
>> by default if explici

flink sql bigint cannot be cast to mysql Long

2020-06-10 Posted by Zhou Zach
SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console.

Thu Jun 11 13:18:18 CST 2020 WARN: Establishing SSL connection without server's 
identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ 
and 5.7.6+ requirements SSL connection must be established by default if 
explicit option isn't set. For compliance with existing applications not using 
SSL the verifyServerCertificate property is set to 'false'. You need either to 
explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide 
truststore for server certificate verification.

[the same WARN message is repeated eight more times; the last occurrence is truncated]
Re:Re: sink mysql failed

2020-06-10 by Zhou Zach
Thanks for the reply! It turned out I had forgotten to set the username and password.
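For anyone who hits the same "Access denied" error together with the SSL warning noise quoted below, a minimal sketch of the JDBC settings involved, using the flink-jdbc 1.10 JDBCOutputFormat builder that appears later in this digest. The URL, database, table and credentials here are placeholders, not values taken from this thread:

import org.apache.flink.api.java.io.jdbc.JDBCOutputFormat

// Sketch only: pass explicit credentials, and silence the SSL warning by
// disabling SSL (or keep useSSL=true and configure a truststore instead).
val jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
  .setDrivername("com.mysql.cj.jdbc.Driver")
  .setDBUrl("jdbc:mysql://localhost:3306/dashboard?useUnicode=true&characterEncoding=utf-8&useSSL=false")
  .setUsername("root")            // the missing piece in this thread
  .setPassword("your-password")   // placeholder
  .setQuery("INSERT INTO user_behavior (uid, time, phoneType, clickCount) VALUES (?, ?, ?, ?)")
  .finish()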

















At 2020-06-10 16:54:43, "wangweigu...@stevegame.cn"  
wrote:
>
>Caused by: java.sql.SQLException: Access denied for user ''@'localhost' (using 
>password: NO)
>You need to specify an account that actually has privileges on that MySQL table!
>
>
> 
>From: Zhou Zach
>Sent: 2020-06-10 16:32
>To: Flink user-zh mailing list
>Subject: sink mysql failed
>SLF4J: Class path contains multiple SLF4J bindings.
> 
>SLF4J: Found binding in 
>[jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> 
>SLF4J: Found binding in 
>[jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> 
>SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
>explanation.
> 
>SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> 
>ERROR StatusLogger No log4j2 configuration file found. Using default 
>configuration: logging only errors to the console.
> 
>Wed Jun 10 16:27:09 CST 2020 WARN: Establishing SSL connection without 
>server's identity verification is not recommended. According to MySQL 5.5.45+, 
>5.6.26+ and 5.7.6+ requirements SSL connection must be established by default 
>if explicit option isn't set. For compliance with existing applications not 
>using SSL the verifyServerCertificate property is set to 'false'. You need 
>either to explicitly disable SSL by setting useSSL=false, or set useSSL=true 
>and provide truststore for server certificate verification.
> 
>[the same WARN message is repeated six more times]

sink mysql failed

2020-06-10 by Zhou Zach
SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in 
[jar:file:/Users/Zach/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console.

Wed Jun 10 16:27:09 CST 2020 WARN: Establishing SSL connection without server's 
identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ 
and 5.7.6+ requirements SSL connection must be established by default if 
explicit option isn't set. For compliance with existing applications not using 
SSL the verifyServerCertificate property is set to 'false'. You need either to 
explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide 
truststore for server certificate verification.

[the same WARN message is repeated seven more times]

Exception in thread "main" java.util.concurrent.ExecutionException: 
org.apache.flink.client.program.ProgramInvocationException: Job 

Re:Re: flink sql consume kafka failed

2020-06-09 by Zhou Zach
Thanks for the reply. After changing the timestamp written to Kafka to "2020-06-10T12:12:43Z", consumption succeeded.

















On 2020-06-10 13:25:01, "Leonard Xu"  wrote:
>Hi,
>
>> Caused by: java.io.IOException: Failed to deserialize JSON object.
>
>The error says JSON deserialization failed. Based on the pitfalls others have already hit, please check two things:
>(1) timestamp values in the JSON must be in the format "2020-06-10T12:12:43Z", not epoch milliseconds as a long; there is a community issue tracking this that has not been resolved yet.
>(2) check whether the corresponding Kafka topic contains dirty data; "earliest-offset" consumes from the first record of the topic.
>
>Best,
>Leonard Xu
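As a follow-up, a small sketch of producing the timestamp format described above when building the JSON payload on the producer side; the field names are made up for illustration:

import java.time.Instant
import java.time.format.DateTimeFormatter
import java.time.temporal.ChronoUnit

// Sketch only: format the event time as "2020-06-10T12:12:43Z" instead of
// writing the raw epoch-millisecond long into the JSON.
val epochMillis = System.currentTimeMillis()
val eventTime = DateTimeFormatter.ISO_INSTANT
  .format(Instant.ofEpochMilli(epochMillis).truncatedTo(ChronoUnit.SECONDS))
val payload = s"""{"uid":"u1","event_time":"$eventTime"}"""   // placeholder field names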


flink sql consume kafka failed

2020-06-09 by Zhou Zach




Exception in thread "main" java.util.concurrent.ExecutionException: 
org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 
994bd5a683143be23a23d77ed005d20d)
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1640)
at 
org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1620)
at 
org.apache.flink.table.planner.delegation.StreamExecutor.execute(StreamExecutor.java:42)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:643)
at org.rabbit.sql.FromKafkaSinkMysql$.main(FromKafkaSinkMysql.scala:66)
at org.rabbit.sql.FromKafkaSinkMysql.main(FromKafkaSinkMysql.scala)
Caused by: org.apache.flink.client.program.ProgramInvocationException: Job 
failed (JobID: 994bd5a683143be23a23d77ed005d20d)
at 
org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:112)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at 
org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:874)
at akka.dispatch.OnComplete.internal(Future.scala:264)
at akka.dispatch.OnComplete.internal(Future.scala:261)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at 
org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
at 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
at 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at 
akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
at 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed.
at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
at 
org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:110)
... 31 more
Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by 
NoRestartBackoffTimeStrategy
at 
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:110)
at 
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:76)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:186)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:180)
at 
org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:484)
at 
org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:380)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown 

Re:Re:Re:flink sink to mysql

2020-06-08 by Zhou Zach




The code in my previous mail got garbled, so I'm re-sending it as a screenshot:














On 2020-06-08 17:20:54, "Zhou Zach"  wrote:
>
>
>
>I've been trying the JDBCOutputFormat approach and it has never worked for me.
>
>
>code:
>object FromKafkaSinkJdbcByJdbcOutputFormat { def main(args: Array[String]): 
>Unit = { val env = getEnv() val topic = "t4" val consumer = 
>getFlinkKafkaConsumer(topic) consumer.setStartFromLatest() val sourceStream = 
>env .addSource(consumer) .setParallelism(1) val mapDS = 
>sourceStream.map(message => { try { JSON.parseObject(message, 
>classOf[BehaviorData]) } catch { case _ => { println("read json failed") 
>BehaviorData("", "", "", 0) } } }) val rowDS = mapDS .map(behaviorData => { 
>println(s"behaviorData: ** $behaviorData") val row: Row = new 
>Row(6) row.setField(0, behaviorData.uid.getBytes("UTF-8")) row.setField(1, 
>behaviorData.time.getBytes("UTF-8")) row.setField(2, 
>behaviorData.phoneType.getBytes("UTF-8")) row.setField(3, 
>behaviorData.clickCount.intValue()) row }) rowDS.print() val sql = "query = 
>INSERT INTO user_behavior (uid, time, phoneType, clickCount) VALUES (?, ?, ?, 
>?)" val jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat() 
>.setDrivername("com.mysql.cj.jdbc.Driver") 
>.setDBUrl("jdbc:mysql://localhost:3306/dashboard?useUnicode=true=utf-8=true=true")
> .setUsername("root") .setPassword("") .setQuery(sql) // 
>.setSqlTypes(Array(Types.STRING, Types.STRING, Types.STRING, Types.LONG)) 
>.finish() rowDS.writeUsingOutputFormat( jdbcOutput ) env.execute() } }
>
>
>
>
>When I comment out the sink line:
>rowDS.writeUsingOutputFormat( jdbcOutput )
>I can see the printed logs:
>behaviorData: ** 
>BehaviorData(6b67c8c700427dee7552f81f3228c927,1591607608894,iOS,7) 3> [54, 98, 
>54, 55, 99, 56, 99, 55, 48, 48, 52, 50, 55, 100, 101, 101, 55, 53, 53, 50, 
>102, 56, 49, 102, 51, 50, 50, 56, 99, 57, 50, 55],[49, 53, 57, 49, 54, 48, 55, 
>54, 48, 56, 56, 57, 52],[105, 79, 83],7,null,null behaviorData: 
>** 
>BehaviorData(a95f22eabc4fd4b580c011a3161a9d9d,1591607609394,iOS,1) 4> [97, 57, 
>53, 102, 50, 50, 101, 97, 98, 99, 52, 102, 100, 52, 98, 53, 56, 48, 99, 48, 
>49, 49, 97, 51, 49, 54, 49, 97, 57, 100, 57, 100],[49, 53, 57, 49, 54, 48, 55, 
>54, 48, 57, 51, 57, 52],[105, 79, 83],1,null,null
>
>
>When I keep the sink line:
>rowDS.writeUsingOutputFormat( jdbcOutput )
>
>no logs show up at all. Is my jdbcOutput definition wrong?
>
>
>On 2020-06-03 19:16:47, "chaojianok"  wrote:
>>I'd recommend JDBCOutputFormat; it's simple and easy to use.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>On 2020-06-03 18:11:38, "Zhou Zach"  wrote:
>>>hi all,
>>> For sinking from flink to mysql, is it recommended to extend RichSinkFunction or to use JDBCOutputFormat?


Re:Re:flink sink to mysql

2020-06-08 by Zhou Zach



I've been trying the JDBCOutputFormat approach and it has never worked for me.


code:
object FromKafkaSinkJdbcByJdbcOutputFormat {
  def main(args: Array[String]): Unit = {
    val env = getEnv()
    val topic = "t4"
    val consumer = getFlinkKafkaConsumer(topic)
    consumer.setStartFromLatest()
    val sourceStream = env.addSource(consumer).setParallelism(1)
    val mapDS = sourceStream.map(message => {
      try {
        JSON.parseObject(message, classOf[BehaviorData])
      } catch {
        case _ => {
          println("read json failed")
          BehaviorData("", "", "", 0)
        }
      }
    })
    val rowDS = mapDS.map(behaviorData => {
      println(s"behaviorData: ** $behaviorData")
      val row: Row = new Row(6)
      row.setField(0, behaviorData.uid.getBytes("UTF-8"))
      row.setField(1, behaviorData.time.getBytes("UTF-8"))
      row.setField(2, behaviorData.phoneType.getBytes("UTF-8"))
      row.setField(3, behaviorData.clickCount.intValue())
      row
    })
    rowDS.print()
    val sql = "query = INSERT INTO user_behavior (uid, time, phoneType, clickCount) VALUES (?, ?, ?, ?)"
    val jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
      .setDrivername("com.mysql.cj.jdbc.Driver")
      .setDBUrl("jdbc:mysql://localhost:3306/dashboard?useUnicode=true=utf-8=true=true")
      .setUsername("root")
      .setPassword("")
      .setQuery(sql)
      // .setSqlTypes(Array(Types.STRING, Types.STRING, Types.STRING, Types.LONG))
      .finish()
    rowDS.writeUsingOutputFormat( jdbcOutput )
    env.execute()
  }
}




When I comment out the sink line:
rowDS.writeUsingOutputFormat( jdbcOutput )
I can see the printed logs:
behaviorData: ** 
BehaviorData(6b67c8c700427dee7552f81f3228c927,1591607608894,iOS,7) 3> [54, 98, 
54, 55, 99, 56, 99, 55, 48, 48, 52, 50, 55, 100, 101, 101, 55, 53, 53, 50, 102, 
56, 49, 102, 51, 50, 50, 56, 99, 57, 50, 55],[49, 53, 57, 49, 54, 48, 55, 54, 
48, 56, 56, 57, 52],[105, 79, 83],7,null,null behaviorData: ** 
BehaviorData(a95f22eabc4fd4b580c011a3161a9d9d,1591607609394,iOS,1) 4> [97, 57, 
53, 102, 50, 50, 101, 97, 98, 99, 52, 102, 100, 52, 98, 53, 56, 48, 99, 48, 49, 
49, 97, 51, 49, 54, 49, 97, 57, 100, 57, 100],[49, 53, 57, 49, 54, 48, 55, 54, 
48, 57, 51, 57, 52],[105, 79, 83],1,null,null


When I keep the sink line:
rowDS.writeUsingOutputFormat( jdbcOutput )

no logs show up at all. Is my jdbcOutput definition wrong?


On 2020-06-03 19:16:47, "chaojianok"  wrote:
>I'd recommend JDBCOutputFormat; it's simple and easy to use.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 2020-06-03 18:11:38, "Zhou Zach"  wrote:
>>hi all,
>> For sinking from flink to mysql, is it recommended to extend RichSinkFunction or to use JDBCOutputFormat?
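A few things stand out in the code above, so here is a minimal corrected sketch rather than the original author's fix: the SQL string should not start with "query = ", the Row arity should match the number of placeholders (the two trailing null fields in the printed rows come from new Row(6) with only four fields set), and setSqlTypes lets JDBCOutputFormat bind the parameters. The URL and credentials are placeholders; mapDS and BehaviorData are the ones from the post above.

import org.apache.flink.api.java.io.jdbc.JDBCOutputFormat
import org.apache.flink.types.Row
import java.sql.Types

// Sketch only, assuming mapDS: DataStream[BehaviorData] as in the post above.
val sql = "INSERT INTO user_behavior (uid, time, phoneType, clickCount) VALUES (?, ?, ?, ?)"

val rowDS = mapDS.map(b => {
  val row = new Row(4)                       // arity matches the four placeholders
  row.setField(0, b.uid)                     // plain strings instead of byte arrays
  row.setField(1, b.time)
  row.setField(2, b.phoneType)
  row.setField(3, b.clickCount.intValue())
  row
})

val jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
  .setDrivername("com.mysql.cj.jdbc.Driver")
  .setDBUrl("jdbc:mysql://localhost:3306/dashboard?useUnicode=true&characterEncoding=utf-8&useSSL=false") // placeholder URL
  .setUsername("root")                       // placeholder credentials
  .setPassword("your-password")
  .setQuery(sql)
  .setSqlTypes(Array(Types.VARCHAR, Types.VARCHAR, Types.VARCHAR, Types.INTEGER))
  .finish()

rowDS.writeUsingOutputFormat(jdbcOutput)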


flink sink to mysql

2020-06-03 by Zhou Zach
hi all,
 For sinking from flink to mysql, is it recommended to extend RichSinkFunction or to use JDBCOutputFormat?

Re:Re: flink sql write to hive partitioned table failed

2020-05-28 by Zhou Zach
What a detailed reply, and it even points to the relevant test cases!
Thanks very much!
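For reference, the dynamic-partition form from the quoted reply below, wrapped in the same sqlUpdate call used elsewhere in this thread; a minimal sketch assuming the dwdCatalog tables from the original post:

// Sketch only: with dynamic partitioning the partition columns p_year/p_month
// are selected as the trailing columns instead of being fixed in a PARTITION clause.
tableEnv.sqlUpdate(
  """
    |INSERT INTO dwdCatalog.dwd.t1_copy
    |select id, name, `p_year`, `p_month`
    |from dwdCatalog.dwd.t1
    |where `p_year` = 2020 and `p_month` = 4
    |""".stripMargin)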

















On 2020-05-28 14:23:33, "Leonard Xu"  wrote:
> 
>>|INSERT INTO dwdCatalog.dwd.t1_copy partition (`p_year` = p_year, 
>> `p_month` = p_month)
>>|select id,name from dwdCatalog.dwd.t1 where `p_year` = 2020 and 
>> `p_month` = 4 
>
>Dynamic partitions are not specified like that; the syntax is the same as Hive's. Both of the forms below should work. The Flink docs are a bit thin on this, see [1][2]
>
>INSERT INTO dwdCatalog.dwd.t1_copy 
> select id,name,`p_year`,`p_month` from dwdCatalog.dwd.t1 where `p_year` = 
> 2020 and `p_month` = 4 
>
>INSERT INTO dwdCatalog.dwd.t1_copy 
>select * from dwdCatalog.dwd.t1 where `p_year` = 2020 and `p_month` = 4 
>
>Best,
>Leonard Xu
>[1] 
>https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/insert.html#examples
> 
><https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/insert.html#examples>
>[2]  
>https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/TableEnvHiveConnectorTest.java#L294
> 
><https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/TableEnvHiveConnectorTest.java#L294>
>
>
>
>> On 2020-05-28 at 13:59, Zhou Zach  wrote:
>> 
>> Thanks for the pointer, it works now.
>> But when I switch to a dynamic-partition insert, there is a problem:
>> org.apache.flink.client.program.ProgramInvocationException: The main method 
>> caused an error: SQL parse failed. Encountered "p_year" at line 3, column 58.
>> Was expecting one of:
>>"DATE" ...
>>"FALSE" ...
>>"INTERVAL" ...
>>"NULL" ...
>>"TIME" ...
>>"TIMESTAMP" ...
>>"TRUE" ...
>>"UNKNOWN" ...
>> ...
>> ...
>> ...
>> ...
>> ...
>> ...
>> ...
>> ...
>> ...
>> ...
>>"+" ...
>>"-" ...
>> 
>> 
>> at 
>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>> at 
>> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>> at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>> at 
>> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>> at 
>> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>> at 
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> at 
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>> at 
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
>> 
>> 
>> 
>> 
>> Query:
>> tableEnv.sqlUpdate(
>>  """
>>    |
>>|INSERT INTO dwdCatalog.dwd.t1_copy partition (`p_year` = p_year, 
>> `p_month` = p_month)
>>|select id,name from dwdCatalog.dwd.t1 where `p_year` = 2020 and 
>> `p_month` = 4
>>|
>>|""".stripMargin)
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On 2020-05-28 13:39:49, "Leonard Xu"  wrote:
>>> Hi,
>>>>   |select * from dwdCatalog.dwd.t1 where `p_year` = 2020 and `p_month` 
>>>> = 5
>>> 
>>> It's probably that select * also pulls in the partition columns, so the fields no longer match; list the columns you need in the select.
>>> 
>>> Best,
>>> Leonard Xu
>>> 
>>>> On 2020-05-28 at 12:57, Zhou Zach  wrote:
>>>> 
>>>> org.apache.flink.client.program.ProgramInvocationException: The main 
>>>> method caused an error: Field types of query result and registered 
>>>> TableSink dwdCatalog.dwd.t1_copy do not match.
>>>> 
>>>> Query schema: [id: BIGINT, name: STRING, p_year: INT, p_month: INT, 
>>>> EXPR$4: INT NOT NULL, EXPR$5: INT NOT NULL]
>>>> 
>>>> Sink schema: [id: BIGINT, name: STRING, p_year: INT, p_month: INT]
>>>> 
>>>> at 
>>>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>>>> 
>>>> at 
>>>> org.apache.flink.c

Re:Re: flink sql write to hive partitioned table failed

2020-05-28 by Zhou Zach
Thanks for the pointer, it works now.
But when I switch to a dynamic-partition insert, there is a problem:
org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: SQL parse failed. Encountered "p_year" at line 3, column 58.
Was expecting one of:
"DATE" ...
"FALSE" ...
"INTERVAL" ...
"NULL" ...
"TIME" ...
"TIMESTAMP" ...
"TRUE" ...
"UNKNOWN" ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
"+" ...
"-" ...


at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)




Query:
tableEnv.sqlUpdate(
  """
|
|INSERT INTO dwdCatalog.dwd.t1_copy partition (`p_year` = p_year, 
`p_month` = p_month)
|select id,name from dwdCatalog.dwd.t1 where `p_year` = 2020 and 
`p_month` = 4
|
|""".stripMargin)

















On 2020-05-28 13:39:49, "Leonard Xu"  wrote:
>Hi,
>>|select * from dwdCatalog.dwd.t1 where `p_year` = 2020 and `p_month` 
>> = 5
>
>It's probably that select * also pulls in the partition columns, so the fields no longer match; list the columns you need in the select.
> 
>Best,
>Leonard Xu
>
>> On 2020-05-28 at 12:57, Zhou Zach  wrote:
>> 
>> org.apache.flink.client.program.ProgramInvocationException: The main method 
>> caused an error: Field types of query result and registered TableSink 
>> dwdCatalog.dwd.t1_copy do not match.
>> 
>> Query schema: [id: BIGINT, name: STRING, p_year: INT, p_month: INT, EXPR$4: 
>> INT NOT NULL, EXPR$5: INT NOT NULL]
>> 
>> Sink schema: [id: BIGINT, name: STRING, p_year: INT, p_month: INT]
>> 
>> at 
>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>> 
>> at 
>> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>> 
>> at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>> 
>> at 
>> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>> 
>> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>> 
>> at 
>> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>> 
>> at 
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>> 
>> at java.security.AccessController.doPrivileged(Native Method)
>> 
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> 
>> at 
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>> 
>> at 
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>> 
>> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
>> 
>> 
>> 
>> 
>> The Hive partitioned tables:
>> CREATE TABLE `dwd.t1`(
>>  `id` bigint, 
>>  `name` string)
>> PARTITIONED BY ( 
>>  `p_year` int, 
>>  `p_month` int)
>> 
>> 
>> CREATE TABLE `dwd.t1_copy`(
>>  `id` bigint, 
>>  `name` string)
>> PARTITIONED BY ( 
>>  `p_year` int, 
>>  `p_month` int)
>> 
>> 
>> Flink sql:
>> tableEnv.sqlUpdate(
>>  """
>>|
>>|INSERT INTO dwdCatalog.dwd.t1_copy partition (`p_year` = 2020, 
>> `p_month` = 5)
>>|select * from dwdCatalog.dwd.t1 where `p_year` = 2020 and `p_month` 
>> = 5
>>|
>>|""".stripMargin)
>> 
>> 
>> thanks for your help


flink sql write to hive partitioned table failed

2020-05-27 by Zhou Zach
org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: Field types of query result and registered TableSink 
dwdCatalog.dwd.t1_copy do not match.

Query schema: [id: BIGINT, name: STRING, p_year: INT, p_month: INT, EXPR$4: INT 
NOT NULL, EXPR$5: INT NOT NULL]

Sink schema: [id: BIGINT, name: STRING, p_year: INT, p_month: INT]

at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)

at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)

at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)

at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)

at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)

at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)

at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)

at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)

at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)




The Hive partitioned tables:
CREATE TABLE `dwd.t1`(
  `id` bigint, 
  `name` string)
PARTITIONED BY ( 
  `p_year` int, 
  `p_month` int)
  
  
CREATE TABLE `dwd.t1_copy`(
  `id` bigint, 
  `name` string)
PARTITIONED BY ( 
  `p_year` int, 
  `p_month` int)


Flink sql:
tableEnv.sqlUpdate(
  """
|
|INSERT INTO dwdCatalog.dwd.t1_copy partition (`p_year` = 2020, 
`p_month` = 5)
|select * from dwdCatalog.dwd.t1 where `p_year` = 2020 and `p_month` = 5
|
|""".stripMargin)


thanks for your help
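The field-count mismatch above comes from select * also returning the partition columns, as the replies earlier in this digest point out. A minimal sketch of the static-partition insert with an explicit column list, using the same tables:

// Sketch only: with a static PARTITION clause, select only the non-partition
// columns so the query schema matches the sink schema (id, name).
tableEnv.sqlUpdate(
  """
    |INSERT INTO dwdCatalog.dwd.t1_copy partition (`p_year` = 2020, `p_month` = 5)
    |select id, name from dwdCatalog.dwd.t1 where `p_year` = 2020 and `p_month` = 5
    |""".stripMargin)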

Re:Re: Re: Re: Re: Flink sql cross-database access

2020-05-27 by Zhou Zach
OK, thanks for the guidance.

















On 2020-05-27 19:33:42, "Rui Li"  wrote:
>Do you want to debug the HiveCatalog code? You can refer to the tests in Flink: some of them run an embedded metastore (for example HiveCatalogHiveMetadataTest), and some start a separate HMS process (for example TableEnvHiveConnectorTest).
>
>On Wed, May 27, 2020 at 7:27 PM Zhou Zach  wrote:
>
>> Yes, I noticed that, thanks for the pointer. One more question: when debugging with IntelliJ
>> IDEA, do you debug locally? If so, do I need a local Hadoop cluster, or at least a local Hive? Or do I connect IntelliJ
>> IDEA to a remote cluster, and if the cluster is on Alibaba Cloud, do I need to open extra ports?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-05-27 19:19:58, "Rui Li"  wrote:
>> >year is a reserved keyword in Calcite; try writing it as `year`
>> >
>> >On Wed, May 27, 2020 at 7:09 PM Zhou Zach  wrote:
>> >
>> >> The program finished with the following exception:
>> >>
>> >>
>> >> org.apache.flink.client.program.ProgramInvocationException: The main
>> >> method caused an error: SQL parse failed. Encountered "year =" at line
>> 4,
>> >> column 51.
>> >> Was expecting one of:
>> >> "ARRAY" ...
>> >> "CASE" ...
>> >> "CURRENT" ...
>> >> "CURRENT_CATALOG" ...
>> >> "CURRENT_DATE" ...
>> >> "CURRENT_DEFAULT_TRANSFORM_GROUP" ...
>> >> "CURRENT_PATH" ...
>> >> "CURRENT_ROLE" ...
>> >> "CURRENT_SCHEMA" ...
>> >> "CURRENT_TIME" ...
>> >> "CURRENT_TIMESTAMP" ...
>> >> "CURRENT_USER" ...
>> >> "DATE" ...
>> >> "EXISTS" ...
>> >> "FALSE" ...
>> >> "INTERVAL" ...
>> >> "LOCALTIME" ...
>> >> "LOCALTIMESTAMP" ...
>> >> "MULTISET" ...
>> >> "NEW" ...
>> >> "NEXT" ...
>> >> "NOT" ...
>> >> "NULL" ...
>> >> "PERIOD" ...
>> >> "SESSION_USER" ...
>> >> "SYSTEM_USER" ...
>> >> "TIME" ...
>> >> "TIMESTAMP" ...
>> >> "TRUE" ...
>> >> "UNKNOWN" ...
>> >> "USER" ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >> "?" ...
>> >> "+" ...
>> >> "-" ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >>  ...
>> >> "CAST" ...
>> >> "EXTRACT" ...
>> >> "POSITION" ...
>> >> "CONVERT" ...
>> >> "TRANSLATE" ...
>> >> "OVERLAY" ...
>> >> "FLOOR" ...
>> >> "CEIL" ...
>> >> "CEILING" ...
>> >> "SUBSTRING" ...
>> >> "TRIM" ...
>> >> "CLASSIFIER" ...
>> >> "MATCH_NUMBER" ...
>> >> "RUNNING" ...
>> >> "PREV" ...
>> >> "JSON_EXISTS" ...
>> >> "JSON_VALUE" ...
>> >> "JSON_QUERY" ...
>> >> "JSON_OBJECT" ...
>> >> "JSON_OBJECTAGG" ...
>> >> "JSON_ARRAY" ...
>> >> "JSON_ARRAYAGG" ...
>> >> "MAP" ...
>> >> "SPECIFIC" ...
>> >> "ABS" ...
>> >> "AVG" ...
>> >> "CARDINALITY" ...
>> >> "CHAR_LENGTH" ...
>> >> "CHARACTER_LENGTH" ...
>> >> "COALESCE" ...
>> >> "COLLECT" ...
>> >> "COVAR_POP" ...
>> >> "COVAR_SAMP" ...
>> >> "CUME_DIST" ...
>> >> "COUNT" ...
>> >> "DENSE_RANK" ...
>> >> "ELEMENT" ...
>> >> "EXP" ...
>> >>

Re:Re: Re: Re: Flink sql cross-database access

2020-05-27 by Zhou Zach
Yes, I noticed that, thanks for the pointer. One more question: when debugging with IntelliJ 
IDEA, do you debug locally? If so, do I need a local Hadoop cluster, or at least a local Hive? Or do I connect IntelliJ 
IDEA to a remote cluster, and if the cluster is on Alibaba Cloud, do I need to open extra ports?

















On 2020-05-27 19:19:58, "Rui Li"  wrote:
>year is a reserved keyword in Calcite; try writing it as `year`
>
>On Wed, May 27, 2020 at 7:09 PM Zhou Zach  wrote:
>
>> The program finished with the following exception:
>>
>>
>> org.apache.flink.client.program.ProgramInvocationException: The main
>> method caused an error: SQL parse failed. Encountered "year =" at line 4,
>> column 51.
>> Was expecting one of:
>> "ARRAY" ...
>> "CASE" ...
>> "CURRENT" ...
>> "CURRENT_CATALOG" ...
>> "CURRENT_DATE" ...
>> "CURRENT_DEFAULT_TRANSFORM_GROUP" ...
>> "CURRENT_PATH" ...
>> "CURRENT_ROLE" ...
>> "CURRENT_SCHEMA" ...
>> "CURRENT_TIME" ...
>> "CURRENT_TIMESTAMP" ...
>> "CURRENT_USER" ...
>> "DATE" ...
>> "EXISTS" ...
>> "FALSE" ...
>> "INTERVAL" ...
>> "LOCALTIME" ...
>> "LOCALTIMESTAMP" ...
>> "MULTISET" ...
>> "NEW" ...
>> "NEXT" ...
>> "NOT" ...
>> "NULL" ...
>> "PERIOD" ...
>> "SESSION_USER" ...
>> "SYSTEM_USER" ...
>> "TIME" ...
>> "TIMESTAMP" ...
>> "TRUE" ...
>> "UNKNOWN" ...
>> "USER" ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>> "?" ...
>> "+" ...
>> "-" ...
>>  ...
>>  ...
>>  ...
>>  ...
>>  ...
>> "CAST" ...
>> "EXTRACT" ...
>> "POSITION" ...
>> "CONVERT" ...
>> "TRANSLATE" ...
>> "OVERLAY" ...
>> "FLOOR" ...
>> "CEIL" ...
>> "CEILING" ...
>> "SUBSTRING" ...
>> "TRIM" ...
>> "CLASSIFIER" ...
>> "MATCH_NUMBER" ...
>> "RUNNING" ...
>> "PREV" ...
>> "JSON_EXISTS" ...
>> "JSON_VALUE" ...
>> "JSON_QUERY" ...
>> "JSON_OBJECT" ...
>> "JSON_OBJECTAGG" ...
>> "JSON_ARRAY" ...
>> "JSON_ARRAYAGG" ...
>> "MAP" ...
>> "SPECIFIC" ...
>> "ABS" ...
>> "AVG" ...
>> "CARDINALITY" ...
>> "CHAR_LENGTH" ...
>> "CHARACTER_LENGTH" ...
>> "COALESCE" ...
>> "COLLECT" ...
>> "COVAR_POP" ...
>> "COVAR_SAMP" ...
>> "CUME_DIST" ...
>> "COUNT" ...
>> "DENSE_RANK" ...
>> "ELEMENT" ...
>> "EXP" ...
>> "FIRST_VALUE" ...
>> "FUSION" ...
>> "GROUPING" ...
>> "HOUR" ...
>> "LAG" ...
>> "LEAD" ...
>> "LEFT" ...
>> "LAST_VALUE" ...
>> "LN" ...
>> "LOWER" ...
>> "MAX" ...
>> "MIN" ...
>> "MINUTE" ...
>> "MOD" ...
>> "MONTH" ...
>> "NTH_VALUE" ...
>> "NTILE" ...
>> "NULLIF" ...
>> "OCTET_LENGTH" ...
>> "PERCENT_RANK" ...
>> "POWER" ...
>> "RANK" ...
>> "REGR_COUNT" ...
>> "REGR_SXX" ...
>> "REGR_SYY" ...
>> "RIGHT" ...
>> "ROW_NUMBER" ...
>> "SECOND" ...
>> "SQRT" ...
>> "STDDEV_POP" ...
>> "STDDEV_SAMP" ...
>> "SUM" ...
>> "UPPER" ...
>> "TRUNCATE" ...
>> "VAR_POP" ...
>> "VAR_SAMP" ...
>> "YEAR&q

Re:Re:Re: Re: Flink sql cross-database access

2020-05-27 by Zhou Zach
Found the cause: Flink treats year as a keyword.
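For reference, a minimal sketch of the workaround discussed below: quote the reserved partition column with backticks, using the same catalog2.dwd tables as the original post:

// Sketch only: year is a reserved keyword in Calcite, so the partition column
// has to be written as `year` in the filter.
tableEnv.sqlUpdate(
  """
    |INSERT INTO catalog2.dwd.orders
    |select srcuid, price from catalog2.dwd.bill where `year` = 2020
    |""".stripMargin)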

















At 2020-05-27 19:09:43, "Zhou Zach"  wrote:
>The program finished with the following exception:
>
>
>org.apache.flink.client.program.ProgramInvocationException: The main method 
>caused an error: SQL parse failed. Encountered "year =" at line 4, column 51.
>Was expecting one of:
>"ARRAY" ...
>"CASE" ...
>"CURRENT" ...
>"CURRENT_CATALOG" ...
>"CURRENT_DATE" ...
>"CURRENT_DEFAULT_TRANSFORM_GROUP" ...
>"CURRENT_PATH" ...
>"CURRENT_ROLE" ...
>"CURRENT_SCHEMA" ...
>"CURRENT_TIME" ...
>"CURRENT_TIMESTAMP" ...
>"CURRENT_USER" ...
>"DATE" ...
>"EXISTS" ...
>"FALSE" ...
>"INTERVAL" ...
>"LOCALTIME" ...
>"LOCALTIMESTAMP" ...
>"MULTISET" ...
>"NEW" ...
>"NEXT" ...
>"NOT" ...
>"NULL" ...
>"PERIOD" ...
>"SESSION_USER" ...
>"SYSTEM_USER" ...
>"TIME" ...
>"TIMESTAMP" ...
>"TRUE" ...
>"UNKNOWN" ...
>"USER" ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
>"?" ...
>"+" ...
>"-" ...
> ...
> ...
> ...
> ...
> ...
>"CAST" ...
>"EXTRACT" ...
>"POSITION" ...
>"CONVERT" ...
>"TRANSLATE" ...
>"OVERLAY" ...
>"FLOOR" ...
>"CEIL" ...
>"CEILING" ...
>"SUBSTRING" ...
>"TRIM" ...
>"CLASSIFIER" ...
>"MATCH_NUMBER" ...
>"RUNNING" ...
>"PREV" ...
>"JSON_EXISTS" ...
>"JSON_VALUE" ...
>"JSON_QUERY" ...
>"JSON_OBJECT" ...
>"JSON_OBJECTAGG" ...
>"JSON_ARRAY" ...
>"JSON_ARRAYAGG" ...
>"MAP" ...
>"SPECIFIC" ...
>"ABS" ...
>"AVG" ...
>"CARDINALITY" ...
>"CHAR_LENGTH" ...
>"CHARACTER_LENGTH" ...
>"COALESCE" ...
>"COLLECT" ...
>"COVAR_POP" ...
>"COVAR_SAMP" ...
>"CUME_DIST" ...
>"COUNT" ...
>"DENSE_RANK" ...
>"ELEMENT" ...
>"EXP" ...
>"FIRST_VALUE" ...
>"FUSION" ...
>"GROUPING" ...
>"HOUR" ...
>"LAG" ...
>"LEAD" ...
>"LEFT" ...
>"LAST_VALUE" ...
>"LN" ...
>"LOWER" ...
>"MAX" ...
>"MIN" ...
>"MINUTE" ...
>"MOD" ...
>"MONTH" ...
>"NTH_VALUE" ...
>"NTILE" ...
>"NULLIF" ...
>"OCTET_LENGTH" ...
>"PERCENT_RANK" ...
>"POWER" ...
>"RANK" ...
>"REGR_COUNT" ...
>"REGR_SXX" ...
>"REGR_SYY" ...
>"RIGHT" ...
>"ROW_NUMBER" ...
>"SECOND" ...
>"SQRT" ...
>"STDDEV_POP" ...
>"STDDEV_SAMP" ...
>"SUM" ...
>"UPPER" ...
>"TRUNCATE" ...
>"VAR_POP" ...
>"VAR_SAMP" ...
>"YEAR" ...
>"YEAR" "(" ...
>
>
>at 
>org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>at 
>org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>at 
>org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>at java.security.AccessController.doPrivileged(Native Method)
>at javax.security.auth.Subject.doAs(Subject.java:422)
>at 
>org.apache.hadoop.security.UserGroupInformation.

Re:Re: Re: Flink sql cross-database access

2020-05-27 by Zhou Zach
"CURRENT_TIMESTAMP" ...
"CURRENT_USER" ...
"DATE" ...
"EXISTS" ...
"FALSE" ...
"INTERVAL" ...
"LOCALTIME" ...
"LOCALTIMESTAMP" ...
"MULTISET" ...
"NEW" ...
"NEXT" ...
"NOT" ...
"NULL" ...
"PERIOD" ...
"SESSION_USER" ...
"SYSTEM_USER" ...
"TIME" ...
"TIMESTAMP" ...
"TRUE" ...
"UNKNOWN" ...
"USER" ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
"?" ...
"+" ...
"-" ...
 ...
 ...
 ...
 ...
 ...
"CAST" ...
"EXTRACT" ...
"POSITION" ...
"CONVERT" ...
"TRANSLATE" ...
"OVERLAY" ...
"FLOOR" ...
"CEIL" ...
"CEILING" ...
"SUBSTRING" ...
"TRIM" ...
"CLASSIFIER" ...
"MATCH_NUMBER" ...
"RUNNING" ...
"PREV" ...
"JSON_EXISTS" ...
"JSON_VALUE" ...
"JSON_QUERY" ...
"JSON_OBJECT" ...
"JSON_OBJECTAGG" ...
"JSON_ARRAY" ...
"JSON_ARRAYAGG" ...
"MAP" ...
"SPECIFIC" ...
"ABS" ...
"AVG" ...
"CARDINALITY" ...
"CHAR_LENGTH" ...
"CHARACTER_LENGTH" ...
"COALESCE" ...
"COLLECT" ...
"COVAR_POP" ...
"COVAR_SAMP" ...
"CUME_DIST" ...
"COUNT" ...
"DENSE_RANK" ...
"ELEMENT" ...
"EXP" ...
"FIRST_VALUE" ...
"FUSION" ...
"GROUPING" ...
"HOUR" ...
"LAG" ...
"LEAD" ...
"LEFT" ...
"LAST_VALUE" ...
"LN" ...
"LOWER" ...
"MAX" ...
"MIN" ...
"MINUTE" ...
"MOD" ...
"MONTH" ...
"NTH_VALUE" ...
"NTILE" ...
"NULLIF" ...
"OCTET_LENGTH" ...
"PERCENT_RANK" ...
"POWER" ...
"RANK" ...
"REGR_COUNT" ...
"REGR_SXX" ...
"REGR_SYY" ...
"RIGHT" ...
"ROW_NUMBER" ...
"SECOND" ...
"SQRT" ...
"STDDEV_POP" ...
"STDDEV_SAMP" ...
"SUM" ...
"UPPER" ...
"TRUNCATE" ...
"VAR_POP" ...
"VAR_SAMP" ...
"YEAR" ...
"YEAR" "(" ...


at 
org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:50)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:64)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)

















On 2020-05-27 19:08:09, "Rui Li"  wrote:
>What error do you get when reading the Hive partitioned table? Could you paste the stacktrace?
>
>On Wed, May 27, 2020 at 6:08 PM Zhou Zach  wrote:
>
>>
>>
>> hive partition table:
>>
>>
>> 1CREATE TABLE `dwd.bill`(
>> 2  `id` bigint,
>> 3  `gid` bigint,
>> 4  `count` bigint,
>> 5  `price` bigint,
>> 6  `srcuid` bigint,
>> 7  `srcnickname` string,
>> 8  `srcleftmoney` bigint,
>> 9  `srcwealth` bigint,
>> 10  `srccredit` decimal(10,0),
>> 11  `dstnickname` string,
>> 12  `dstuid` bigint,
>> 13  `familyid` int,
>> 14  `dstleftmoney` bigint,
>> 15  `dstwealth` bigint,
>> 16  `dstcredit` decimal(10,0),
>> 17  `addtime` bigint,
>> 18  `type` int,
>> 19  `getmoney` decimal(10,0),
>> 20  `os` int,
>> 21  `bak` string,
>> 22  `getbonus` decimal(10,0),
>> 23  `unionbonus` decimal(10,0))
>> 24PARTITIONED BY (
>> 25  `year` int,
>> 26  `month` int)
>> 27ROW FORMAT SERDE
>> 28  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>> 29STORED AS INPUTFORMAT
>> 30  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>> 31OUTPUTFORMAT
>> 32  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
>>
>>
>>
>>
>> Query:
>>
>>
>> tableEnv.sqlUpdate(
>>   """
>> |
>> |INSERT INTO catalog2.dwd.orders
>> |select srcuid, price from catalog2.dwd.bill where year = 2020
>> |
>> |
>> |""".stripMargin)
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-05-27 18:01:19, "Leonard Xu"  wrote:
>> >Flink does support Hive partitioned tables. I saw you pasted it in another mail; could you paste your Hive table and the query in this thread?
>> >
>> >Best
>> >Leonard Xu
>> >
>> >> On 2020-05-27 at 17:40, Zhou Zach  wrote:
>> >>
>> >>
>> >>
>> >>
>> >> Thanks for the reply; prefixing the table name with the catalog and db works now.
>> >> Now I hit another problem: when flink reads a Hive
>> partitioned table, filtering in the where clause on a partition key such as year fails, while filtering on other columns works fine. Does flink not support Hive partitioned tables, or is something not configured correctly?
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> On 2020-05-27 17:33:11, "Leonard Xu"  wrote:
>> >>> Hi,
>> >>>> because one HiveCatalog can only be associated with one database
>> >>> One Catalog can be associated with multiple dbs; tables in different catalogs and different dbs can all be accessed.
>> >>>
>> >>> Flink SQL> show catalogs;
>> >>> default_catalog
>> >>> myhive
>> >>> Flink SQL> use catalog myhive;
>> >>> Flink SQL> show databases;
>> >>> default
>> >>> hive_test
>> >>> hive_test1
>> >>> Flink SQL> select * from hive_test.db2_table union select * from
>> myhive.hive_test1.db1_table;
>> >>> 2020-05-27 17:25:48,565 INFO  org.apache.hadoop.hive.conf.HiveConf
>> >>>
>> >>>
>> >>>
>> >>> Best
>> >>> Leonard Xu
>> >>>
>> >>>
>> >>>> On 2020-05-27 at 10:55, Zhou Zach  wrote:
>> >>>>
>> >>>> hi all,
>> >>>> Is it true that Flink SQL's HiveCatalog cannot work across databases? That is, the two tables joined in one flink
>> sql involve two different databases, since one HiveCatalog can only be associated with one database.
>>
>
>
>-- 
>Best regards!
>Rui Li


Re:Re: Flink sql cross-database access

2020-05-27 by Zhou Zach


hive partition table:


1CREATE TABLE `dwd.bill`(
2  `id` bigint, 
3  `gid` bigint, 
4  `count` bigint, 
5  `price` bigint, 
6  `srcuid` bigint, 
7  `srcnickname` string, 
8  `srcleftmoney` bigint, 
9  `srcwealth` bigint, 
10  `srccredit` decimal(10,0), 
11  `dstnickname` string, 
12  `dstuid` bigint, 
13  `familyid` int, 
14  `dstleftmoney` bigint, 
15  `dstwealth` bigint, 
16  `dstcredit` decimal(10,0), 
17  `addtime` bigint, 
18  `type` int, 
19  `getmoney` decimal(10,0), 
20  `os` int, 
21  `bak` string, 
22  `getbonus` decimal(10,0), 
23  `unionbonus` decimal(10,0))
24PARTITIONED BY ( 
25  `year` int, 
26  `month` int)
27ROW FORMAT SERDE 
28  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
29STORED AS INPUTFORMAT 
30  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
31OUTPUTFORMAT 
32  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'




Query:


tableEnv.sqlUpdate(
  """
|
|INSERT INTO catalog2.dwd.orders
|select srcuid, price from catalog2.dwd.bill where year = 2020
|
|
|""".stripMargin)

















On 2020-05-27 18:01:19, "Leonard Xu"  wrote:
>Flink does support Hive partitioned tables. I saw you pasted it in another mail; could you paste your Hive table and the query in this thread?
>
>Best
>Leonard Xu
>
>> On 2020-05-27 at 17:40, Zhou Zach  wrote:
>> 
>> 
>> 
>> 
>> Thanks for the reply; prefixing the table name with the catalog and db works now.
>> Now I hit another problem: when flink reads a Hive partitioned table, filtering in the where clause on a partition key such as year fails, while filtering on other columns works fine. Does flink
>> not support Hive partitioned tables, or is something not configured correctly?
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On 2020-05-27 17:33:11, "Leonard Xu"  wrote:
>>> Hi,
>>>> because one HiveCatalog can only be associated with one database
>>> One Catalog can be associated with multiple dbs; tables in different catalogs and different dbs can all be accessed.
>>> 
>>> Flink SQL> show catalogs;
>>> default_catalog
>>> myhive
>>> Flink SQL> use catalog myhive;
>>> Flink SQL> show databases;
>>> default
>>> hive_test
>>> hive_test1
>>> Flink SQL> select * from hive_test.db2_table union select * from 
>>> myhive.hive_test1.db1_table;
>>> 2020-05-27 17:25:48,565 INFO  org.apache.hadoop.hive.conf.HiveConf
>>> 
>>> 
>>> 
>>> Best
>>> Leonard Xu
>>> 
>>> 
>>>> On 2020-05-27 at 10:55, Zhou Zach  wrote:
>>>> 
>>>> hi all,
>>>> Is it true that Flink SQL's HiveCatalog cannot work across databases? That is, the two tables joined in one flink
>>>> sql involve two different databases, since one HiveCatalog can only be associated with one database.


Re:Re: Flink sql cross-database access

2020-05-27 by Zhou Zach



Thanks for the reply; prefixing the table name with the catalog and db works now.
Now I hit another problem: when flink reads a Hive partitioned table, filtering in the where clause on a partition key such as year fails, while filtering on other columns works fine. Does flink not support
Hive partitioned tables, or is something not configured correctly?














On 2020-05-27 17:33:11, "Leonard Xu"  wrote:
>Hi,
>> because one HiveCatalog can only be associated with one database
>One Catalog can be associated with multiple dbs; tables in different catalogs and different dbs can all be accessed.
>
>Flink SQL> show catalogs;
>default_catalog
>myhive
>Flink SQL> use catalog myhive;
>Flink SQL> show databases;
>default
>hive_test
>hive_test1
>Flink SQL> select * from hive_test.db2_table union select * from 
>myhive.hive_test1.db1_table;
>2020-05-27 17:25:48,565 INFO  org.apache.hadoop.hive.conf.HiveConf
>
>
>
>Best
>Leonard Xu
>
>
>> On 2020-05-27 at 10:55, Zhou Zach  wrote:
>> 
>> hi all,
>> Is it true that Flink SQL's HiveCatalog cannot work across databases? That is, the two tables joined in one flink
>> sql involve two different databases, since one HiveCatalog can only be associated with one database.


Flink read hive partition table failed

2020-05-26 by Zhou Zach
Flink version: 1.10.0
Flink sql read hive partition key failed. Does flink sql not support Hive partition keys?


code:


   val settings = 
EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build()
val tableEnv = TableEnvironment.create(settings)


val hiveConfDir = "/etc/hive/conf" // a local path
val hiveVersion = "2.1.1"


val catalog2 = "catalog2"
val dwdDB = "dwd"
val dwdHiveCatalog = new HiveCatalog(catalog2, dwdDB, hiveConfDir, 
hiveVersion)
tableEnv.registerCatalog("catalog2", dwdHiveCatalog)


tableEnv.sqlUpdate(
  """
|
|INSERT INTO catalog2.dwd.orders
|select srcuid, price from catalog2.dwd.bill where year = 2020
|
|
|""".stripMargin)




tableEnv.execute("Flink-1.10 insert hive Table Testing")






The program finished with the following exception:


org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: SQL parse failed. Encountered "year =" at line 4, column 51.
Was expecting one of:
"ARRAY" ...
"CASE" ...
"CURRENT" ...
"CURRENT_CATALOG" ...
"CURRENT_DATE" ...
"CURRENT_DEFAULT_TRANSFORM_GROUP" ...
"CURRENT_PATH" ...
"CURRENT_ROLE" ...
"CURRENT_SCHEMA" ...
"CURRENT_TIME" ...
"CURRENT_TIMESTAMP" ...
"CURRENT_USER" ...
"DATE" ...
"EXISTS" ...
"FALSE" ...
"INTERVAL" ...
"LOCALTIME" ...
"LOCALTIMESTAMP" ...
"MULTISET" ...
"NEW" ...
"NEXT" ...
"NOT" ...
"NULL" ...
"PERIOD" ...
"SESSION_USER" ...
"SYSTEM_USER" ...
"TIME" ...
"TIMESTAMP" ...
"TRUE" ...
"UNKNOWN" ...
"USER" ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
 ...
"?" ...
"+" ...
"-" ...
 ...
 ...
 ...
 ...
 ...
"CAST" ...
"EXTRACT" ...
"POSITION" ...
"CONVERT" ...
"TRANSLATE" ...
"OVERLAY" ...
"FLOOR" ...
"CEIL" ...
"CEILING" ...
"SUBSTRING" ...
"TRIM" ...
"CLASSIFIER" ...
"MATCH_NUMBER" ...
"RUNNING" ...
"PREV" ...
"JSON_EXISTS" ...
"JSON_VALUE" ...
"JSON_QUERY" ...
"JSON_OBJECT" ...
"JSON_OBJECTAGG" ...
"JSON_ARRAY" ...
"JSON_ARRAYAGG" ...
"MAP" ...
"SPECIFIC" ...
"ABS" ...
"AVG" ...
"CARDINALITY" ...
"CHAR_LENGTH" ...
"CHARACTER_LENGTH" ...
"COALESCE" ...
"COLLECT" ...
"COVAR_POP" ...
"COVAR_SAMP" ...
"CUME_DIST" ...
"COUNT" ...
"DENSE_RANK" ...
"ELEMENT" ...
"EXP" ...
"FIRST_VALUE" ...
"FUSION" ...
"GROUPING" ...
"HOUR" ...
"LAG" ...
"LEAD" ...
"LEFT" ...
"LAST_VALUE" ...
"LN" ...
"LOWER" ...
"MAX" ...
"MIN" ...
"MINUTE" ...
"MOD" ...
"MONTH" ...
"NTH_VALUE" ...
"NTILE" ...
"NULLIF" ...
"OCTET_LENGTH" ...
"PERCENT_RANK" ...
"POWER" ...
"RANK" ...
"REGR_COUNT" ...
"REGR_SXX" ...
"REGR_SYY" ...
"RIGHT" ...
"ROW_NUMBER" ...
"SECOND" ...
"SQRT" ...
"STDDEV_POP" ...
"STDDEV_SAMP" ...
"SUM" ...
"UPPER" ...
"TRUNCATE" ...
"VAR_POP" ...
"VAR_SAMP" ...
"YEAR" ...
"YEAR" "(" ...


at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
Caused by: org.apache.flink.table.api.SqlParserException: SQL parse failed. 
Encountered "year =" at line 4, column 51.
Was expecting one of:
"ARRAY" ...
"CASE" ...
"CURRENT" ...
"CURRENT_CATALOG" ...
"CURRENT_DATE" ...
"CURRENT_DEFAULT_TRANSFORM_GROUP" ...
"CURRENT_PATH" ...
"CURRENT_ROLE" ...
"CURRENT_SCHEMA" ...
"CURRENT_TIME" ...
"CURRENT_TIMESTAMP" ...
"CURRENT_USER" ...
"DATE" ...
"EXISTS" ...
"FALSE" ...
"INTERVAL" ...
"LOCALTIME" ...
"LOCALTIMESTAMP" ...
"MULTISET" ...
"NEW" ...
"NEXT" ...
"NOT" ...
"NULL" ...
"PERIOD" ...
"SESSION_USER" ...
"SYSTEM_USER" ...
"TIME" ...
"TIMESTAMP" ...
"TRUE" ...
"UNKNOWN" ...
"USER" ...
 ...
 ...
 ...
 ...
 ...

Flink sql cross-database access

2020-05-26 by Zhou Zach
hi all,
Is it true that Flink SQL's HiveCatalog cannot work across databases? That is, the two tables joined in one flink
sql involve two different databases, since one HiveCatalog can only be associated with one database.
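For reference, a minimal sketch of the resolution that emerges later in this thread: register a HiveCatalog and qualify tables as catalog.database.table, which lets one query reach tables in different databases. The catalog and Hive settings follow the examples above; the joined ods.user_profile table and its uid column are made up for illustration:

import org.apache.flink.table.catalog.hive.HiveCatalog

// Sketch only: one HiveCatalog can reach multiple databases when tables are
// fully qualified as catalog.database.table.
val hiveConfDir = "/etc/hive/conf"
val myhive = new HiveCatalog("myhive", "default", hiveConfDir, "2.1.1")
tableEnv.registerCatalog("myhive", myhive)

tableEnv.sqlUpdate(
  """
    |INSERT INTO myhive.dwd.orders
    |select b.srcuid, b.price
    |from myhive.dwd.bill b
    |join myhive.ods.user_profile u on b.srcuid = u.uid
    |""".stripMargin)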