[
https://issues.apache.org/jira/browse/FLINK-22086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-22086:
-----------------------------------
Labels: auto-deprioritized-major stale-minor (was:
auto-deprioritized-major)
I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help
the community manage its development. I see this issue has been marked as
Minor but is unassigned, and neither it nor its Sub-Tasks have been updated
for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is
still Minor, please either assign yourself or give an update. Afterwards,
please remove the label or in 7 days the issue will be deprioritized.
> Does Flink 1.12 support rate-limiting when reading from Hive? Reading a Hive
> source leads to a high load average
> -----------------------------------------------------------------------------------------------------
>
> Key: FLINK-22086
> URL: https://issues.apache.org/jira/browse/FLINK-22086
> Project: Flink
> Issue Type: New Feature
> Components: Connectors / Hive
> Affects Versions: 1.12.0
> Reporter: ying zhang
> Priority: Minor
> Labels: auto-deprioritized-major, stale-minor
> Attachments: image-2021-04-01-15-56-50-736.png,
> image-2021-04-01-15-58-18-265.png, image-2021-04-01-16-07-28-469.png,
> image-2021-04-01-16-08-28-442.png
>
>
> I read a Hive source with Flink SQL in batch mode, but I got an exception like this:
> org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
> at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:118)
> at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:80)
> at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:239)
> at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:230)
> at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:221)
> at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:672)
> at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:90)
> at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:453)
> at sun.reflect.GeneratedMethodAccessor312.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:306)
> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:213)
> at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:159)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: readAddress(..) failed: Connection reset by peer (connection to '11.69.21.53/11.69.21.53:19977')
> at org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.exceptionCaught(CreditBasedPartitionRequestClientHandler.java:201)
> at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
> at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
> at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
> at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1377)
> at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
> at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
> at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:907)
> at org.apache.flink.shaded.netty4.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.handleReadException(AbstractEpollStreamChannel.java:728)
> at org.apache.flink.shaded.netty4.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:818)
> at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)
> at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
> at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.shaded.netty4.io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
>
>
>
> Then I checked the machine's monitoring:
> !image-2021-04-01-16-07-28-469.png!
>
>
>
>
> Here is the jstack log:
> !image-2021-04-01-15-58-18-265.png!
>
>
> "Source Data Fetcher for Source:
> HiveSource-app.app_sdl_yinliu_search_query_log (164/250)#0" Id=815
> cpuUsage=99.96% deltaTime=201ms time=43270ms RUNNABLE
> at sun.nio.cs.UTF_8$Decoder.decodeLoop(UTF_8.java:412)
> at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:579)
> at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:802)
> at org.apache.hadoop.io.Text.decode(Text.java:412)
> at org.apache.hadoop.io.Text.decode(Text.java:389)
> at org.apache.hadoop.io.Text.toString(Text.java:280)
> at
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:46)
> at
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:26)
> at
> org.apache.flink.table.functions.hive.conversion.HiveInspectors.toFlinkObject(HiveInspectors.java:291)
> at
> org.apache.flink.table.functions.hive.conversion.HiveInspectors.toFlinkObject(HiveInspectors.java:338)
> at
> org.apache.flink.connectors.hive.read.HiveMapredSplitReader.nextRecord(HiveMapredSplitReader.java:180)
> at
> org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.nextRecord(HiveBulkFormatAdapter.java:336)
> at
> org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.readBatch(HiveBulkFormatAdapter.java:319)
> at
> org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:67)
> at
> org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
> at
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138)
> at
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:101)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> "Source Data Fetcher for Source:
> HiveSource-app.app_sdl_yinliu_search_query_log (165/250)#0" Id=817
> cpuUsage=99.96% deltaTime=201ms time=44024ms RUNNABLE
> at org.apache.hadoop.io.Text.append(Text.java:236)
> at
> org.apache.hadoop.hive.ql.io.orc.DynamicByteArray.setText(DynamicByteArray.java:212)
> at
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDictionaryTreeReader.next(TreeReaderFactory.java:1724)
> at
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringTreeReader.next(TreeReaderFactory.java:1397)
> at
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$MapTreeReader.next(TreeReaderFactory.java:2274)
> at
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1046)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:166)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:140)
> at
> org.apache.flink.connectors.hive.read.HiveMapredSplitReader.reachedEnd(HiveMapredSplitReader.java:160)
> at
> org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.readBatch(HiveBulkFormatAdapter.java:318)
> at
> org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:67)
> at
> org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
> at
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138)
> at
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:101)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> "Source Data Fetcher for Source:
> HiveSource-app.app_sdl_yinliu_search_query_log (166/250)#0" Id=816
> cpuUsage=99.96% deltaTime=201ms time=43490ms RUNNABLE
> at
> org.apache.flink.table.data.util.DataFormatConverters$MapConverter.toBinaryMap(DataFormatConverters.java:1279)
> at
> org.apache.flink.table.data.util.DataFormatConverters$MapConverter.toInternalImpl(DataFormatConverters.java:1245)
> at
> org.apache.flink.table.data.util.DataFormatConverters$MapConverter.toInternalImpl(DataFormatConverters.java:1196)
> at
> org.apache.flink.table.data.util.DataFormatConverters$DataFormatConverter.toInternal(DataFormatConverters.java:406)
> at
> org.apache.flink.connectors.hive.read.HiveMapredSplitReader.nextRecord(HiveMapredSplitReader.java:185)
> at
> org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.nextRecord(HiveBulkFormatAdapter.java:336)
> at
> org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.readBatch(HiveBulkFormatAdapter.java:319)
> at
> org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:67)
> at
> org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
> at
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138)
> at
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:101)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> 1. I configured each TaskManager with 10 cores and 12 GB of memory.
> 2. It looks like only 3 cores are fully used, but the load average climbs to 100, and adding cores does not help (see the configuration sketch after this list).
> 3. I ran 'netstat -anp | wc -l'; the result is 18.
> 4. !image-2021-04-01-16-08-28-442.png!
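>
> As far as I can tell, the closest knob in Flink 1.12 is capping the inferred source parallelism, which limits how many split-fetcher threads run per TaskManager but is not a true read-rate limit. A minimal sketch of that cap (the value 32 is illustrative, not a recommendation):
> {code:java}
> import org.apache.flink.table.api.EnvironmentSettings;
> import org.apache.flink.table.api.TableEnvironment;
>
> public class CapHiveSourceParallelism {
>     public static void main(String[] args) {
>         TableEnvironment tEnv = TableEnvironment.create(
>                 EnvironmentSettings.newInstance().inBatchMode().build());
>
>         // Keep automatic source-parallelism inference, but never exceed
>         // 32 subtasks, so fewer split-fetcher threads compete for CPU.
>         tEnv.getConfig().getConfiguration().setBoolean(
>                 "table.exec.hive.infer-source-parallelism", true);
>         tEnv.getConfig().getConfiguration().setInteger(
>                 "table.exec.hive.infer-source-parallelism.max", 32);
>
>         // ... register the HiveCatalog and run the batch query as usual.
>     }
> }
> {code}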
>
>
>
>
> My Hive storage config:
> # Storage Information
> SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed: No
> Num Buckets: -1
> Bucket Columns: []
> Sort Columns: []
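>
> Until there is a built-in rate limit, the only generic workaround I know of is throttling downstream and letting backpressure slow the source fetchers. A sketch using Guava's RateLimiter (the ThrottleMap name and rate are mine; it assumes the table result is bridged to the DataStream API):
> {code:java}
> import com.google.common.util.concurrent.RateLimiter;
>
> import org.apache.flink.api.common.functions.RichMapFunction;
> import org.apache.flink.configuration.Configuration;
>
> /** Caps records/second per subtask; backpressure then slows the Hive source. */
> public class ThrottleMap<T> extends RichMapFunction<T, T> {
>     private final double recordsPerSecond;
>     private transient RateLimiter limiter;
>
>     public ThrottleMap(double recordsPerSecond) {
>         this.recordsPerSecond = recordsPerSecond;
>     }
>
>     @Override
>     public void open(Configuration parameters) {
>         // RateLimiter is not serializable, so create it per subtask here.
>         limiter = RateLimiter.create(recordsPerSecond);
>     }
>
>     @Override
>     public T map(T value) {
>         limiter.acquire(); // blocks until a permit is available
>         return value;
>     }
> }
> {code}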
>
>
>
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)