devinbost opened a new issue #10298: URL: https://github.com/apache/pulsar/issues/10298
I got the error below in one of my functions after upgrading from 2.6.3 to a build cut from master (2.8.0-SNAPSHOT). To get the function to run, I needed to increase the function's RAM through the Admin API. When I captured a heap dump, the function had grown to 6.5 GB! I hadn't needed to increase its RAM in any previous version of Pulsar. In the heap dump, it appears to be retaining far more messages in memory than expected. If I expand a random sample of these objects, they appear to be of mixed types. I tried computing dominators in VisualVM, but that didn't provide any useful info. Any tips for computing summary statistics on the objects would be helpful.

> DEADLINE_EXCEEDED: deadline exceeded after 4.999815491s. [remote_addr=127.0.0.1/127.0.0.1:44422]
>
> 02:51:53.176 [myTenant/myNamespace/function1-0] INFO org.apache.pulsar.functions.instance.JavaInstanceRunnable - Closing instance
> 02:51:52.787 [pulsar-client-io-1-5] WARN io.netty.channel.AbstractChannelHandlerContext - An exception 'java.lang.OutOfMemoryError: GC overhead limit exceeded' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
> io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
>   at io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:74) ~[java-instance.jar:?]
>   at io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:138) ~[java-instance.jar:?]
>   at io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:100) ~[java-instance.jar:?]
>   at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:427) ~[java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [java-instance.jar:?]
>   at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [java-instance.jar:?]
>   at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [java-instance.jar:?]
>   at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [java-instance.jar:?]
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [java-instance.jar:?]
>   at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [java-instance.jar:?]
>   at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [java-instance.jar:?]
>   at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$1.run(AbstractEpollChannel.java:425) [java-instance.jar:?]
>   at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [java-instance.jar:?]
>   at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [java-instance.jar:?]
>   at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [java-instance.jar:?]
>   at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [java-instance.jar:?]
>   at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [java-instance.jar:?]
>   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [java-instance.jar:?]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
> 02:51:53.957 [pulsar-client-io-1-5] WARN org.apache.pulsar.client.impl.ClientCnx - [fab08.umf.prod.ostk.com/10.20.69.37:6650] Got exception java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> 02:51:54.039 [myTenant/myNamespace/function1-0] ERROR org.apache.pulsar.functions.instance.JavaInstanceRunnable - Failed to close source
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 02:51:54.677 [pulsar-client-io-1-5] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0xdaee469d, L:/10.20.69.28:38304 ! R:fab08.umf.prod.ostk.com/10.20.69.37:6650] Disconnected
> 02:51:55.123 [myTenant/myNamespace/function1-0] ERROR org.apache.pulsar.functions.instance.JavaInstanceRunnable - Failed to close sink
> java.lang.InternalError: linkToTargetMethod=Lambda(a0:L,a1:L,a2:L,a3:L)=>{
> t4:L=MethodHandle.invokeBasic(a3:L,a0:L,a1:L,a2:L);t4:L}
>   at java.lang.invoke.MethodHandleStatics.newInternalError(MethodHandleStatics.java:127) ~[?:1.8.0_282]
>   at java.lang.invoke.LambdaForm.compileToBytecode(LambdaForm.java:660) ~[?:1.8.0_282]
>   at java.lang.invoke.Invokers.callSiteForm(Invokers.java:381) ~[?:1.8.0_282]
>   at java.lang.invoke.Invokers.linkToTargetMethod(Invokers.java:347) ~[?:1.8.0_282]
>   at java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(MethodHandleNatives.java:314) ~[?:1.8.0_282]
>   at java.lang.invoke.MethodHandleNatives.linkCallSite(MethodHandleNatives.java:297) ~[?:1.8.0_282]
>   at org.apache.pulsar.client.impl.ProducerImpl.closeAsync(ProducerImpl.java:881) ~[java-instance.jar:?]
>   at org.apache.pulsar.functions.sink.PulsarSink$PulsarSinkProcessorBase.close(PulsarSink.java:180) ~[org.apache.pulsar-pulsar-functions-instance-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
>   at org.apache.pulsar.functions.sink.PulsarSink.close(PulsarSink.java:399) ~[org.apache.pulsar-pulsar-functions-instance-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
>   at org.apache.pulsar.functions.instance.JavaInstanceRunnable.close(JavaInstanceRunnable.java:414) [org.apache.pulsar-pulsar-functions-instance-2.8.0-SNAPSHOT.jar:?]
>   at org.apache.pulsar.functions.instance.JavaInstanceRunnable.run(JavaInstanceRunnable.java:289) [org.apache.pulsar-pulsar-functions-instance-2.8.0-SNAPSHOT.jar:?]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
> 02:51:55.646 [myTenant/myNamespace/function1-0] ERROR org.apache.pulsar.functions.runtime.thread.ThreadRuntime - Uncaught exception in thread Thread[myTenant/myNamespace/function1-0,5,LocalRunnerThreadGroup]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 02:51:55.952 [main] INFO org.apache.pulsar.functions.runtime.JavaInstanceStarter - RuntimeSpawner quit, shutting down JavaInstance
> 02:52:07.724 [function-timer-thread-53-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - myTenant/myNamespace/function1-java.lang.OutOfMemoryError: GC overhead limit exceeded Function Container is dead with exception.. restarting
> 02:52:08.022 [function-timer-thread-53-1] INFO org.apache.pulsar.functions.runtime.thread.ThreadRuntime - Unloading JAR files for function [!!!org.apache.pulsar.functions.instance.InstanceConfig@67239bdd=>java.lang.OutOfMemoryError:GC overhead limit exceeded!!!]
> 02:52:08.898 [function-timer-thread-53-1] INFO org.apache.pulsar.functions.runtime.thread.ThreadRuntime - Load JAR: /tmp/pulsar_functions/myTenant/myNamespace/function1/0/functions.jar

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
