Re: Re: Table API SQL pre-check

2021-04-28 by Michael Ran
It can certainly be dug out of the code logic; my point is that this pre-check would be worth exposing as a public API.
At 2021-04-29 12:29:26, "Shengkai Fang" wrote:
>Hi.
>
>Take a look at the TableEnvironment#execute logic: if the SQL can be translated into a `Transformation`, the syntax should be fine.
>
>Best,
>Shengkai
>
Michael Ran wrote on Thu, Apr 29, 2021 at 11:57 AM:
>
>> dear all :
>> When submitting multiple SQL statements with the Table API, is there an API that can check the SQL for errors up front? Right now errors only show up when the job is submitted and executed.
>> In theory they could already be caught in the SQL parsing phase. Is there a ready-made API, or do I need to dig into that parsing code myself and wrap an API around it?
>> If there isn't one, I hope this feature can be provided; Blink should have it.
>>
>>
>>
>>
>> Thanks !
>>


Re: Fwd: flink1.12.2 CLI connection to Hive throws an exception

2021-04-28 by 张锴
Our production Hive doesn't have Kerberos authentication configured.

张锴 wrote on Thu, Apr 29, 2021 at 10:05 AM:

> Is that documented on the official site? Where did you find it?
>
guoyb <861277...@qq.com> wrote on Wed, Apr 28, 2021 at 10:56 AM:
>
>> I have the same problem and haven't solved it; it's caused by the Kerberos-authenticated Hive.
>>
>>
>>
>> --- Original message ---
>> From: "张锴"> Sent: Wed, Apr 28, 2021, 10:41 AM
>> To: "user-zh"> Subject: Fwd: flink1.12.2 CLI connection to Hive throws an exception
>>
>>
>> -- Forwarded message -
>> From: 张锴 > Date: Tue, Apr 27, 2021, 1:59 PM
>> Subject: flink1.12.2 CLI connection to Hive throws an exception
>> To: >
>>
>> *When connecting to Hive with the flink1.12.2 CLI, I can see the catalog and the tables under the specified catalog, but executing a select statement throws the exception below.*
>> [ERROR] Could not execute SQL statement. Reason:
>> org.apache.hadoop.ipc.RemoteException: Application with id
>> 'application_1605840182730_29292' doesn't exist in RM. Please check that
>> the job submission was suc
>> at
>>
>> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:382)
>> at
>>
>> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:
>> at
>>
>> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:561)
>> at
>>
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
>> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
>> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> at
>>
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>>
>> *Viewing the logs with yarn logs -applicationId application_1605840182730_29292
>> gives no concrete error; the output below shows almost nothing useful.*
>> INFO client.RMProxy: Connecting to ResourceManager at
>> hadoop01.xxx.xxx.xxx/xx.xx.x.xx:8050
>> Unable to get ApplicationState. Attempting to fetch logs directly from the
>> filesystem.
>> Can not find the appOwner. Please specify the correct appOwner
>> Could not locate application logs for application_1605840182730_29292
>>
>> How should I troubleshoot this? Has anyone run into something similar?
>
>


Re: Flink backpressure issue

2021-04-28 by HunterXHunter
There must be bad records or some other error in between; backpressure does not cause data loss.




Re: Table API SQL pre-check

2021-04-28 by Shengkai Fang
Hi.

Take a look at the TableEnvironment#execute logic: if the SQL can be translated into a `Transformation`, the syntax should be fine.

Best,
Shengkai
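
[Editor's note] A minimal sketch of such a pre-check, assuming Flink 1.11+ where TableEnvironment#explainSql is available: explainSql parses, validates, and optimizes a statement without submitting a job, so most syntax and semantic errors surface before execution. The table DDL and statements below are hypothetical.

import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}
import scala.util.{Failure, Success, Try}

val tEnv = TableEnvironment.create(EnvironmentSettings.newInstance().build())
// hypothetical table, so the example is self-contained
tEnv.executeSql(
  """CREATE TABLE basic_fd_base (uid STRING, fid STRING)
    |WITH ('connector' = 'datagen')""".stripMargin)

// validate every statement up front, before executing any of them
val sqls = Seq(
  "SELECT uid, COUNT(DISTINCT fid) FROM basic_fd_base GROUP BY uid",
  "SELEC uid FROM basic_fd_base" // deliberately broken
)
sqls.foreach { sql =>
  Try(tEnv.explainSql(sql)) match {
    case Success(_) => println(s"OK: $sql")
    case Failure(e) => println(s"INVALID: $sql -> ${e.getMessage}")
  }
}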

Michael Ran wrote on Thu, Apr 29, 2021 at 11:57 AM:

> dear all :
> When submitting multiple SQL statements with the Table API, is there an API that can check the SQL for errors up front? Right now errors only show up when the job is submitted and executed.
> In theory they could already be caught in the SQL parsing phase. Is there a ready-made API, or do I need to dig into that parsing code myself and wrap an API around it?
> If there isn't one, I hope this feature can be provided; Blink should have it.
>
>
>
>
> Thanks !
>


Re: How to build multiple sinks with the Table & SQL API

2021-04-28 by Shengkai Fang
Hi.

You can specify multiple inserts through a `StatementSet`; that is how you build multiple sinks.

Best,
Shengkai
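
[Editor's note] A minimal sketch, assuming Flink 1.11+ where the old writeToSink has been superseded by StatementSet; the table names and DDL below are hypothetical, with print sinks to keep it self-contained. Because both INSERTs are planned together into a single job graph, the planner can split the plan into RelNodeBlocks and reuse the common subtree.

import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

val tEnv = TableEnvironment.create(EnvironmentSettings.newInstance().build())
tEnv.executeSql("CREATE TABLE test_table (a INT, b STRING, c STRING) WITH ('connector' = 'datagen')")
tEnv.executeSql("CREATE TABLE sink1 (a INT, b STRING) WITH ('connector' = 'print')")
tEnv.executeSql("CREATE TABLE sink2 (a INT, c STRING) WITH ('connector' = 'print')")

val stmtSet = tEnv.createStatementSet()
stmtSet.addInsertSql("INSERT INTO sink1 SELECT a, b FROM test_table WHERE a >= 70")
stmtSet.addInsertSql("INSERT INTO sink2 SELECT a, c FROM test_table WHERE a < 70")
// submits a single job containing both sinks
stmtSet.execute()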

Han Han1 Yue wrote on Wed, Apr 28, 2021 at 2:30 PM:

> Hi,
> I'm going through the RelNodeBlock logic. Common subtrees are only split out and reused when there are multiple sinks, so how do I construct multiple sinks?
> The writeToSink() used in the RelNodeBlock.scala source can no longer be found.
>
> // Multi-sink example from the source code
> val sourceTable = tEnv.scan("test_table").select('a, 'b, 'c)
> val leftTable = sourceTable.filter('a > 0).select('a as 'a1, 'b as 'b1)
> val rightTable = sourceTable.filter('c.isNotNull).select('b as 'b2, 'c as 'c2)
> val joinTable = leftTable.join(rightTable, 'a1 === 'b2)
> joinTable.where('a1 >= 70).select('a1, 'b1).writeToSink(sink1)
> joinTable.where('a1 < 70 ).select('a1, 'c2).writeToSink(sink2)
>
> Thanks
>


Re: Fwd: flink1.12.2 CLI connection to Hive throws an exception

2021-04-28 by 田向阳
Start a new yarn session cluster.


田向阳
Email: lucas_...@163.com

On 2021-04-29 10:05, 张锴 wrote:
Is that documented on the official site? Where did you find it?

guoyb <861277...@qq.com> wrote on Wed, Apr 28, 2021 at 10:56 AM:

> I have the same problem and haven't solved it; it's caused by the Kerberos-authenticated Hive.
>
>
>
> --- Original message ---
> From: "张锴" Sent: Wed, Apr 28, 2021, 10:41 AM
> To: "user-zh" Subject: Fwd: flink1.12.2 CLI connection to Hive throws an exception
>
>
> -- Forwarded message -
> From: 张锴  Date: Tue, Apr 27, 2021, 1:59 PM
> Subject: flink1.12.2 CLI connection to Hive throws an exception
> To: 
>
> *When connecting to Hive with the flink1.12.2 CLI, I can see the catalog and the tables under the specified catalog, but executing a select statement throws the exception below.*
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.hadoop.ipc.RemoteException: Application with id
> 'application_1605840182730_29292' doesn't exist in RM. Please check that
> the job submission was suc
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:382)
> at
>
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:
> at
>
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:561)
> at
>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>
> *Viewing the logs with yarn logs -applicationId application_1605840182730_29292
> gives no concrete error; the output below shows almost nothing useful.*
> INFO client.RMProxy: Connecting to ResourceManager at
> hadoop01.xxx.xxx.xxx/xx.xx.x.xx:8050
> Unable to get ApplicationState. Attempting to fetch logs directly from the
> filesystem.
> Can not find the appOwner. Please specify the correct appOwner
> Could not locate application logs for application_1605840182730_29292
>
> How should I troubleshoot this? Has anyone run into something similar?


Table API SQL pre-check

2021-04-28 by Michael Ran
dear all :
When submitting multiple SQL statements with the Table API, is there an API that can check the SQL for errors up front? Right now errors only show up when the job is submitted and executed.
In theory they could already be caught in the SQL parsing phase. Is there a ready-made API, or do I need to dig into that parsing code myself and wrap an API around it?
If there isn't one, I hope this feature can be provided; Blink should have it.




Thanks !


Re: Flink backpressure issue

2021-04-28 by Michael Ran
That shouldn't happen; there must be an error somewhere in between...
At 2021-04-29 11:45:17, "Bruce Zhang" wrote:
>My source emits one record per second, while the downstream operator takes six seconds to process each record and write it to the database. I tested with a parallelism of one. After all records were sent, only the first three and the last two showed up in the database; everything in between was lost. It looks like a backpressure issue. What could cause this?


Flink backpressure issue

2021-04-28 by Bruce Zhang
My source emits one record per second, while the downstream operator takes six seconds to process each record and write it to the database. I tested with a parallelism of one. After all records were sent, only the first three and the last two showed up in the database; everything in between was lost. It looks like a backpressure issue. What could cause this?


Re: Fwd: flink1.12.2 CLI connection to Hive throws an exception

2021-04-28 by 张锴
Is that documented on the official site? Where did you find it?

guoyb <861277...@qq.com> wrote on Wed, Apr 28, 2021 at 10:56 AM:

> I have the same problem and haven't solved it; it's caused by the Kerberos-authenticated Hive.
>
>
>
> --- Original message ---
> From: "张锴" Sent: Wed, Apr 28, 2021, 10:41 AM
> To: "user-zh" Subject: Fwd: flink1.12.2 CLI connection to Hive throws an exception
>
>
> -- Forwarded message -
> From: 张锴  Date: Tue, Apr 27, 2021, 1:59 PM
> Subject: flink1.12.2 CLI connection to Hive throws an exception
> To: 
>
> *When connecting to Hive with the flink1.12.2 CLI, I can see the catalog and the tables under the specified catalog, but executing a select statement throws the exception below.*
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.hadoop.ipc.RemoteException: Application with id
> 'application_1605840182730_29292' doesn't exist in RM. Please check that
> the job submission was suc
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:382)
> at
>
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:
> at
>
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:561)
> at
>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>
> *Viewing the logs with yarn logs -applicationId application_1605840182730_29292
> gives no concrete error; the output below shows almost nothing useful.*
> INFO client.RMProxy: Connecting to ResourceManager at
> hadoop01.xxx.xxx.xxx/xx.xx.x.xx:8050
> Unable to get ApplicationState. Attempting to fetch logs directly from the
> filesystem.
> Can not find the appOwner. Please specify the correct appOwner
> Could not locate application logs for application_1605840182730_29292
>
> How should I troubleshoot this? Has anyone run into something similar?


Re: Job fails after running for a while with a RocksDB-level error: Error while retrieving data from RocksDB

2021-04-28 by Peihui He
Feels like RocksDB is running out of memory; have you tried increasing it?
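
[Editor's note] A minimal sketch of what "increasing it" could look like, assuming Flink 1.10+ where RocksDB draws from managed memory; the values are illustrative, and on a real cluster these keys normally go into flink-conf.yaml. Note that "Requested array size exceeds VM limit" can also mean a single list-state entry has grown past the JVM's maximum array size, in which case more memory alone will not help.

import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

val conf = new Configuration()
// let RocksDB budget its block cache and write buffers from Flink managed memory
conf.setString("state.backend.rocksdb.memory.managed", "true")
// raise the share of TaskManager memory that becomes managed memory (default 0.4)
conf.setString("taskmanager.memory.managed.fraction", "0.6")
// illustrative local environment; on YARN, set the keys in flink-conf.yaml instead
val env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf)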

a593700624 <593700...@qq.com> wrote on Wed, Apr 28, 2021 at 3:47 PM:

> org.apache.flink.util.FlinkRuntimeException: Error while retrieving data
> from
> RocksDB
> at
>
> org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121)
> at
>
> org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111)
> at
>
> org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:60)
> at
>
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onEventTime(WindowOperator.java:455)
> at
>
> org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:276)
> at
>
> org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:154)
> at
>
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:568)
> at
>
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitWatermark(OneInputStreamTask.java:167)
> at
>
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:179)
> at
>
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:101)
> at
> org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:180)
> at
> org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> at
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> at
>
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> at
>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> at
>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> at
>
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> at
>
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.rocksdb.RocksDBException: Requested array size exceeds VM
> limit
> at org.rocksdb.RocksDB.get(Native Method)
> at org.rocksdb.RocksDB.get(RocksDB.java:810)
> at
>
> org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:118)
> ... 20 more
>
>
> It can run for a few hours, but it always hits this problem and keeps falling into restart loops.
>
>
>
>


Re: flink 1.12.1 savepoint restart issue

2021-04-28 by 田向阳
I've run into this too and don't know the cause; it only happens occasionally, which makes it really hard to pin down.


田向阳
Email: lucas_...@163.com

On 2021-04-28 17:00, chaos wrote:
Hi, due to a business-logic change I need to restart a Flink job running in production, using the savepoint approach.

The steps:
# Look up the yarn application id
yarn application -list
# Look up the Flink job id
flink list -t yarn-per-job
-Dyarn.application.id=application_1617064018715_0097
# Stop the job with a savepoint
flink stop -t yarn-per-job -p hdfs:///user/chaos/flink/0097_savepoint/
-Dyarn.application.id=application_1617064018715_0097
d7c1e0231f0e72cbd709baa9a0ba6415
Savepoint completed. Path:
hdfs://nameservice1/user/chaos/flink/0097_savepoint/savepoint-d7c1e0-96f5b07d18be
# Restart the job from the savepoint
flink run -t yarn-per-job -s
hdfs://nameservice1/user/chaos/flink/0097_savepoint/savepoint-d7c1e0-96f5b07d18be
-d -c xxx xxx.jar

The problem I ran into:
Service temporarily unavailable due to an ongoing leader election. Please
refresh

Any pointers would be appreciated. Thanks in advance!





flink 1.12.1 savepoint restart issue

2021-04-28 by chaos
Hi, due to a business-logic change I need to restart a Flink job running in production, using the savepoint approach.

The steps:
# Look up the yarn application id
yarn application -list
# Look up the Flink job id
flink list -t yarn-per-job
-Dyarn.application.id=application_1617064018715_0097
# Stop the job with a savepoint
flink stop -t yarn-per-job -p hdfs:///user/chaos/flink/0097_savepoint/
-Dyarn.application.id=application_1617064018715_0097
d7c1e0231f0e72cbd709baa9a0ba6415
Savepoint completed. Path:
hdfs://nameservice1/user/chaos/flink/0097_savepoint/savepoint-d7c1e0-96f5b07d18be
# Restart the job from the savepoint
flink run -t yarn-per-job -s
hdfs://nameservice1/user/chaos/flink/0097_savepoint/savepoint-d7c1e0-96f5b07d18be
-d -c xxx xxx.jar

The problem I ran into:
Service temporarily unavailable due to an ongoing leader election. Please
refresh

Any pointers would be appreciated. Thanks in advance!





flink sql 1.12 minibatch question

2021-04-28 by op
A question about minibatch in Flink SQL 1.12. The configuration:

val config = tConfig.getConfiguration()
// enable mini-batch
config.setString("table.exec.mini-batch.enabled", "true")
// allow-latency takes a duration such as "5 s"; a boolean value would be rejected
config.setString("table.exec.mini-batch.allow-latency", "5 s")
// the size must be passed as a string value
config.setString("table.exec.mini-batch.size", "100")
// enable two-phase, i.e. local-global aggregation
config.setString("table.optimizer.agg-phase-strategy", "TWO_PHASE")
// does not support user-defined AggregateFunction
config.setString("table.optimizer.distinct-agg.split.enabled", "true")

The SQL:

tableEnv.executeSql(
  s"""insert into event_test
     |select
     |  date_format(create_time,'yyyyMMdd') dt,
     |  uid,
     |  count(distinct fid) text_feed_count,
     |  max(event_time) event_time
     |from basic_fd_base where ftype <> '0' and uid is not null
     |  group by
     |   date_format(create_time,'yyyyMMdd'),
     |   uid
  """.stripMargin).print()

event_test uses the print connector and basic_fd_base uses a kafka connector; event_time is a timestamp column, and some rows print with a null event_time:
2> +I(20210427,747873111868904192,1,2021-04-27T14:03:10)
2> +I(20210427,709531067945685120,1,null)
2> +I(20210427,759213633292150016,1,2021-04-27T13:59:01.923)
2> +I(20210427,758340406550406272,4,2021-04-27T14:02:14.553)
2> +I(20210427,759658063329437312,1,2021-04-27T14:02:18.305)
2> +I(20210427,737415823706231680,1,2021-04-27T14:02:11.061)
2> +I(20210427,xishuashu...@sohu.com,1,2021-04-27T14:05:37)
2> +I(20210427,759219266892539008,1,null)
2> +I(20210427,758349976605763328,1,2021-04-27T14:02:24.184)
2> -U(20210427,709531067945685120,1,null)
2> +U(20210427,709531067945685120,1,2021-04-27T14:09:27.156)
2> +I(20210427,751664239562922752,1,2021-04-27T14:16:51.133)
2> -U(20210427,759219266892539008,1,null)
2> +U(20210427,759219266892539008,1,2021-04-27T14:12:40.692)
2> +I(20210427,745540385069273984,1,2021-04-27T14:23:34)
2> +I(20210427,745399833011098240,1,2021-04-27T14:20:32.870)
2> +I(20210427,714590484395398016,1,2021-04-27T14:19:06)
2> +I(20210427,747859173236216832,1,2021-04-27T14:28:21.864)
2> +I(20210427,746212052309316608,1,null)
2> +I(20210427,666839205279797376,1,2021-04-27T14:26:36.743)
2> +I(20210427,758334362541565568,3,2021-04-27T14:18:58.396)
2> +I(20210427,758325137706788480,1,2021-04-27T14:01:09.053)
2> +I(20210427,747837209624908800,1,2021-04-27T13:59:44.193)
2> -U(20210427,758388594254750720,1,2021-04-27T14:00:44.212)
2> +U(20210427,758388594254750720,4,2021-04-27T14:14:55)
2> +I(20210427,75946621079296,1,2021-04-27T14:25:59.019)
2> -U(20210427,762769243539450496,1,2021-04-27T14:04:29)
2> +U(20210427,762769243539450496,2,2021-04-27T14:04:29)
2> +I(20210427,720648040456852096,1,2021-04-27T14:19:38.680)
2> +I(20210427,750144041584368000,1,2021-04-27T14:29:25.621)
2> +I(20210427,713108045701517952,1,null)
Why do these nulls appear when minibatch is enabled?

Job fails after running for a while with a RocksDB-level error: Error while retrieving data from RocksDB

2021-04-28 by a593700624
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from
RocksDB
at
org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121)
at
org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111)
at
org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:60)
at
org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onEventTime(WindowOperator.java:455)
at
org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:276)
at
org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:154)
at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:568)
at
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitWatermark(OneInputStreamTask.java:167)
at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:179)
at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:101)
at
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:180)
at
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
at
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.rocksdb.RocksDBException: Requested array size exceeds VM
limit
at org.rocksdb.RocksDB.get(Native Method)
at org.rocksdb.RocksDB.get(RocksDB.java:810)
at
org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:118)
... 20 more


It can run for a few hours, but it always hits this problem and keeps falling into restart loops.






org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB

2021-04-28 by a593700624
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from
RocksDB
at
org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121)
at
org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111)
at
org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:60)
at
org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onEventTime(WindowOperator.java:455)
at
org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:276)
at
org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:154)
at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:568)
at
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitWatermark(OneInputStreamTask.java:167)
at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:179)
at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:101)
at
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:180)
at
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
at
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.rocksdb.RocksDBException: Requested array size exceeds VM
limit
at org.rocksdb.RocksDB.get(Native Method)
at org.rocksdb.RocksDB.get(RocksDB.java:810)
at
org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:118)
... 20 more


My streaming job keeps hitting this error after running for a while. The message says Requested array size exceeds VM
limit, so presumably some RocksDB parameter needs adjusting. Why does this happen?


