Re: Re: Table-api SQL pre-check
It can certainly be dug out of the code logic; my point is that this pre-check capability would be worth exposing as a public API.

On 2021-04-29 12:29:26, "Shengkai Fang" wrote:
> Hi.
>
> Take a look at the TableEnvironment#execute logic: if the SQL can be translated into a `Transformation`, the syntax should be fine.
>
> Best,
> Shengkai
Re: Fwd: Exception when connecting to Hive with the flink 1.12.2 CLI
The Hive in my production environment here does not have Kerberos authentication configured.

张锴 wrote on Thu, Apr 29, 2021, 10:05:
> Does the official documentation say so? Where did you find that?
>
> guoyb <861277...@qq.com> wrote on Wed, Apr 28, 2021, 10:56:
>> I have the same problem and have not solved it; it is caused by the Kerberos-authenticated Hive.
>>
>> ---------- Forwarded message ----------
>> From: 张锴
>> Date: Tue, Apr 27, 2021, 13:59
>> Subject: Exception when connecting to Hive with the flink 1.12.2 CLI
>>
>> *With the flink 1.12.2 CLI I can connect to Hive and see the catalog and the tables under it, but executing a SELECT statement throws an exception.*
>> [ERROR] Could not execute SQL statement. Reason:
>> org.apache.hadoop.ipc.RemoteException: Application with id 'application_1605840182730_29292' doesn't exist in RM. Please check that the job submission was suc
>>     at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:382)
>>     at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:
>>     at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:561)
>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
>>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
>>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>>
>> *Running `yarn logs -applicationId application_1605840182730_29292` does not give a concrete error; below is the output, which reveals almost nothing.*
>> INFO client.RMProxy: Connecting to ResourceManager at hadoop01.xxx.xxx.xxx/xx.xx.x.xx:8050
>> Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
>> Can not find the appOwner. Please specify the correct appOwner
>> Could not locate application logs for application_1605840182730_29292
>>
>> How should I troubleshoot this? Has anyone run into something similar?
Re: flink backpressure issue
There are bad records or some other error in the middle; backpressure itself does not cause data loss.

--
Sent from: http://apache-flink.147419.n8.nabble.com/
Re: Table-api SQL pre-check
Hi.

Take a look at the TableEnvironment#execute logic: if the SQL can be translated into a `Transformation`, the syntax should be fine.

Best,
Shengkai

Michael Ran wrote on Thu, Apr 29, 2021, 11:57:
> Dear all:
> When submitting multiple SQL statements with the Table API, is there an API that can check the SQL for errors up front? At the moment the errors only surface at submission/execution time.
> In theory they could be caught in the SQL parsing phase. Is there a ready-made API, or do I have to dig out that parsing code myself and wrap it in an API?
> If there is none, I hope this feature can be provided; Blink should have it.
>
> Thanks!
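A minimal sketch of such a pre-check, building on the hint above: `TableEnvironment#explainSql` parses, validates, and plans a statement without executing it, so syntax and validation errors surface before anything is submitted. The table name, connector, and catch-all error handling below are illustrative assumptions, not code from this thread.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SqlPreCheck {

    // Returns true if the planner accepts the statement; explainSql()
    // parses, validates, and optimizes it without launching a job.
    static boolean isValid(TableEnvironment tEnv, String sql) {
        try {
            tEnv.explainSql(sql);
            return true;
        } catch (Exception e) { // e.g. SqlParserException, ValidationException
            System.err.println("SQL rejected: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());
        // 'src' is a hypothetical table registered only for this demo
        tEnv.executeSql(
                "CREATE TABLE src (a INT) WITH ('connector' = 'datagen')");
        isValid(tEnv, "SELECT a FROM src");  // accepted by the planner
        isValid(tEnv, "SELEC a FROM src");   // rejected before any submission
    }
}
```

Note that this only covers what the planner can see; runtime failures (cluster resources, connector connectivity) will still only appear at execution time.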
Re: How to build multiple sinks with the Table & SQL API
Hi.

You can specify multiple INSERTs through a `StatementSet`; that is how you construct multiple sinks.

Best,
Shengkai

Han Han1 Yue wrote on Wed, Apr 28, 2021, 14:30:
> Hi,
> I am analyzing the RelNodeBlock logic. Common subtrees are only split out and reused when there are multiple sinks, so how do I construct multiple sinks? The writeToSink() shown in the source comments of RelNodeBlock.scala no longer exists.
>
> // multi-sink example from the source comments
> val sourceTable = tEnv.scan("test_table").select('a, 'b, 'c)
> val leftTable = sourceTable.filter('a > 0).select('a as 'a1, 'b as 'b1)
> val rightTable = sourceTable.filter('c.isNotNull).select('b as 'b2, 'c as 'c2)
> val joinTable = leftTable.join(rightTable, 'a1 === 'b2)
> joinTable.where('a1 >= 70).select('a1, 'b1).writeToSink(sink1)
> joinTable.where('a1 < 70).select('a1, 'c2).writeToSink(sink2)
>
> Thanks
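A minimal sketch of the `StatementSet` approach (the table and connector names are made up for illustration): all INSERT statements added to the set are translated in one pass, which is what triggers the RelNodeBlock splitting and common-subtree reuse asked about above.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;

public class MultiSinkDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());

        // Hypothetical source and two sinks; a real job would register
        // Kafka/JDBC/... tables via DDL or a catalog instead.
        tEnv.executeSql("CREATE TABLE src (a1 INT, b1 STRING, c2 STRING) "
                + "WITH ('connector' = 'datagen')");
        tEnv.executeSql("CREATE TABLE sink1 (a1 INT, b1 STRING) "
                + "WITH ('connector' = 'print')");
        tEnv.executeSql("CREATE TABLE sink2 (a1 INT, c2 STRING) "
                + "WITH ('connector' = 'print')");

        // Both INSERTs go into one StatementSet, so they are planned and
        // submitted as a single job; the shared scan of 'src' is reused.
        StatementSet set = tEnv.createStatementSet();
        set.addInsertSql("INSERT INTO sink1 SELECT a1, b1 FROM src WHERE a1 >= 70");
        set.addInsertSql("INSERT INTO sink2 SELECT a1, c2 FROM src WHERE a1 < 70");
        set.execute();
    }
}
```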
Re: Fwd: Exception when connecting to Hive with the flink 1.12.2 CLI
Start a new YARN session cluster.

田向阳 (lucas_...@163.com)

On Apr 29, 2021, 10:05, 张锴 wrote:
> Does the official documentation say so? Where did you find that?
>
> guoyb <861277...@qq.com> wrote on Wed, Apr 28, 2021, 10:56:
>> I have the same problem and have not solved it; it is caused by the Kerberos-authenticated Hive.
Table-api SQL pre-check
Dear all:

When submitting multiple SQL statements with the Table API, is there an API that can check the SQL for errors up front? At the moment the errors only surface at submission/execution time.

In theory they could be caught in the SQL parsing phase. Is there a ready-made API, or do I have to dig out that parsing code myself and wrap it in an API? If there is none, I hope this feature can be provided; Blink should have it.

Thanks!
Re: flink backpressure issue
That should not happen; there must be an error somewhere in between.

On 2021-04-29 11:45:17, "Bruce Zhang" wrote:
> My source emits one record per second, but the downstream operator needs six seconds to process and persist each record. I tested with a parallelism of 1. After all the data had been sent, only the first three and the last two records were in the database; everything in between was lost. It looks like a backpressure issue. What could be the cause?
flink backpressure issue
My source emits one record per second, but the downstream operator needs six seconds to process and persist each record. I tested with a parallelism of 1. After all the data had been sent, only the first three and the last two records were in the database; everything in between was lost. It looks like a backpressure issue. What could be the cause?
Re: Fwd: Exception when connecting to Hive with the flink 1.12.2 CLI
Does the official documentation say so? Where did you find that?

guoyb <861277...@qq.com> wrote on Wed, Apr 28, 2021, 10:56:
> I have the same problem and have not solved it; it is caused by the Kerberos-authenticated Hive.
Re: The stream keeps failing with a RocksDB-level error after running for a while: Error while retrieving data from RocksDB
Feels like RocksDB ran out of memory; have you tried increasing it?

a593700624 <593700...@qq.com> wrote on Wed, Apr 28, 2021, 15:47:
> org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB
> ...
> Caused by: org.rocksdb.RocksDBException: Requested array size exceeds VM limit
>     at org.rocksdb.RocksDB.get(Native Method)
>     at org.rocksdb.RocksDB.get(RocksDB.java:810)
>     at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:118)
>     ... 20 more
>
> It can run for a few hours, but it always ends up in a restart loop because of this.
Re: flink 1.12.1 savepoint restart issue
I have run into this too and have no idea of the cause. It also only happens occasionally, which makes it really hard to pin down.

田向阳 (lucas_...@163.com)

On Apr 28, 2021, 17:00, chaos wrote:
> Hi, because of a business-logic change I need to restart a running Flink job, using the savepoint approach.
> ...
> The problem I hit:
> Service temporarily unavailable due to an ongoing leader election. Please refresh
flink 1.12.1 savepoint restart issue
Hi, because of a business-logic change I need to restart a running Flink job, using the savepoint approach.

Steps:

# find the yarn application id
yarn application -list

# find the flink job id
flink list -t yarn-per-job -Dyarn.application.id=application_1617064018715_0097

# stop the job with a savepoint
flink stop -t yarn-per-job -p hdfs:///user/chaos/flink/0097_savepoint/ -Dyarn.application.id=application_1617064018715_0097 d7c1e0231f0e72cbd709baa9a0ba6415
Savepoint completed. Path: hdfs://nameservice1/user/chaos/flink/0097_savepoint/savepoint-d7c1e0-96f5b07d18be

# restart from the savepoint
flink run -t yarn-per-job -s hdfs://nameservice1/user/chaos/flink/0097_savepoint/savepoint-d7c1e0-96f5b07d18be -d -c xxx xxx.jar

The problem I hit:
Service temporarily unavailable due to an ongoing leader election. Please refresh

Any pointers would be appreciated. Thanks in advance!
flink sql 1.12 minibatch question
I enabled minibatch for flink sql 1.12 as follows:

val config = tConfig.getConfiguration()
// mini-batch is enabled
config.setString("table.exec.mini-batch.enabled", "true")
// allow-latency takes a duration, and setString takes string values
config.setString("table.exec.mini-batch.allow-latency", "5 s")
config.setString("table.exec.mini-batch.size", "100")
// enable two-phase, i.e. local-global aggregation
config.setString("table.optimizer.agg-phase-strategy", "TWO_PHASE")
// not supported with user-defined AggregateFunctions
config.setString("table.optimizer.distinct-agg.split.enabled", "true")

The SQL:

tableEnv.executeSql(
  s"""insert into event_test
     |select
     |  date_format(create_time,'MMdd') dt,
     |  uid,
     |  count(distinct fid) text_feed_count,
     |  max(event_time) event_time
     |from basic_fd_base where ftype <> '0' and uid is not null
     | group by
     |  date_format(create_time,'MMdd'),
     |  uid
  """.stripMargin).print()

event_test uses the print connector and basic_fd_base the kafka connector. event_time is a timestamp column used as event time, yet some results come out null:

2> +I(20210427,747873111868904192,1,2021-04-27T14:03:10)
2> +I(20210427,709531067945685120,1,null)
2> +I(20210427,759213633292150016,1,2021-04-27T13:59:01.923)
2> +I(20210427,758340406550406272,4,2021-04-27T14:02:14.553)
2> +I(20210427,759658063329437312,1,2021-04-27T14:02:18.305)
2> +I(20210427,737415823706231680,1,2021-04-27T14:02:11.061)
2> +I(20210427,xishuashu...@sohu.com,1,2021-04-27T14:05:37)
2> +I(20210427,759219266892539008,1,null)
2> +I(20210427,758349976605763328,1,2021-04-27T14:02:24.184)
2> -U(20210427,709531067945685120,1,null)
2> +U(20210427,709531067945685120,1,2021-04-27T14:09:27.156)
2> +I(20210427,751664239562922752,1,2021-04-27T14:16:51.133)
2> -U(20210427,759219266892539008,1,null)
2> +U(20210427,759219266892539008,1,2021-04-27T14:12:40.692)
2> +I(20210427,745540385069273984,1,2021-04-27T14:23:34)
2> +I(20210427,745399833011098240,1,2021-04-27T14:20:32.870)
2> +I(20210427,714590484395398016,1,2021-04-27T14:19:06)
2> +I(20210427,747859173236216832,1,2021-04-27T14:28:21.864)
2> +I(20210427,746212052309316608,1,null)
2> +I(20210427,666839205279797376,1,2021-04-27T14:26:36.743)
2> +I(20210427,758334362541565568,3,2021-04-27T14:18:58.396)
2> +I(20210427,758325137706788480,1,2021-04-27T14:01:09.053)
2> +I(20210427,747837209624908800,1,2021-04-27T13:59:44.193)
2> -U(20210427,758388594254750720,1,2021-04-27T14:00:44.212)
2> +U(20210427,758388594254750720,4,2021-04-27T14:14:55)
2> +I(20210427,75946621079296,1,2021-04-27T14:25:59.019)
2> -U(20210427,762769243539450496,1,2021-04-27T14:04:29)
2> +U(20210427,762769243539450496,2,2021-04-27T14:04:29)
2> +I(20210427,720648040456852096,1,2021-04-27T14:19:38.680)
2> +I(20210427,750144041584368000,1,2021-04-27T14:29:25.621)
2> +I(20210427,713108045701517952,1,null)

Why does max(event_time) produce null in the output when minibatch is enabled?
The stream keeps failing with a RocksDB-level error after running for a while: Error while retrieving data from RocksDB
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB
    at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121)
    at org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111)
    at org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:60)
    at org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onEventTime(WindowOperator.java:455)
    at org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:276)
    at org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:154)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:568)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitWatermark(OneInputStreamTask.java:167)
    at org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:179)
    at org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:101)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:180)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.rocksdb.RocksDBException: Requested array size exceeds VM limit
    at org.rocksdb.RocksDB.get(Native Method)
    at org.rocksdb.RocksDB.get(RocksDB.java:810)
    at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:118)
    ... 20 more

It can run for a few hours, but it always ends up in a restart loop because of this.
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB
    at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121)
    at org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111)
    at org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:60)
    at org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onEventTime(WindowOperator.java:455)
    at org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:276)
    at org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:154)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:568)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitWatermark(OneInputStreamTask.java:167)
    at org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:179)
    at org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:101)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:180)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.rocksdb.RocksDBException: Requested array size exceeds VM limit
    at org.rocksdb.RocksDB.get(Native Method)
    at org.rocksdb.RocksDB.get(RocksDB.java:810)
    at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:118)
    ... 20 more

My stream keeps hitting this after running for a while. The cause seems to be "Requested array size exceeds VM limit", so presumably some RocksDB parameter needs adjusting. Why does this happen?