[ https://issues.apache.org/jira/browse/KAFKA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606303#comment-16606303 ]
John Roesler commented on KAFKA-7214:
-------------------------------------

Hi [~habdank],

If I understood the scenario, you get stability problems when you increase the load by 50x but only increase memory by 6x. This doesn't seem too surprising. Following a simple linear projection, if your app is healthy at 100 msg/s with a 300 MB heap, then when you increase the load by 50x (if it were heap-constrained to begin with) you also need 50x the heap, which puts you at 15 GB. The fact that you are healthy at one third of that, 5 GB, indicates that it actually wasn't heap-constrained to begin with.

It seems like your initial hypothesis was that you needed 3 MB per msg/s, and your new hypothesis is that you need 1 MB per msg/s. So if you scale up again, you can use this as a starting point and adjust down or up, depending on how the system performs. Note that on the lower end of the spectrum, the overhead will tend to dominate, so I'm not sure you can run 100 msg/s in only 100 MB of heap, and you almost certainly cannot run 1 msg/s in 1 MB of heap. Scaling up, the data itself will dominate the heap, but you'll find there is a limit to this thinking as well, since Java performs poorly with very large heaps (in the terabyte range).

About your analysis:
> 5000 Msg/s ~ 150 000 Msg/30 sec ~ 150 MB
This is a good lower bound, since you know that at a minimum all the live messages must reside in memory, but it is not likely to be a tight lower bound. It assumes there is no overhead at all, i.e., that the only thing in the heap is the messages themselves. That cannot be true: the JVM has overhead of its own, Streams and the clients have their own data structures to maintain, each message is resident a couple of times over due to serialization and deserialization, and, because Java is memory-managed, every object continues to occupy heap after it is no longer live, until it gets collected.

Does this help?
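The linear projection and the lower-bound estimate above can be sketched as a tiny calculation. This is only an illustration: the class and method names are made up, and the ~1 KB-per-message figure (implicit in the 150,000 msg ~ 150 MB step) is an assumption, not a measurement from the thread.

```java
// Back-of-envelope heap lower bound for in-flight messages.
// Assumption: ~1 KB per message, a 30-second window of live messages.
public class HeapEstimate {
    // Minimum bytes needed just to hold the live messages themselves,
    // ignoring all JVM, Streams, and (de)serialization overhead.
    static long lowerBoundBytes(long msgPerSec, long windowSec, long bytesPerMsg) {
        return msgPerSec * windowSec * bytesPerMsg;
    }

    public static void main(String[] args) {
        long bytes = lowerBoundBytes(5_000, 30, 1_024);
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints 146 MB, matching the ~150 MB estimate
    }
}
```

As the comment notes, this is a loose lower bound: the real heap requirement sits somewhere above it, because every message also exists in serialized and deserialized form and dead objects linger until collected.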
-John

> Mystic FATAL error
> ------------------
>
>                 Key: KAFKA-7214
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7214
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.11.0.3, 1.1.1
>            Reporter: Seweryn Habdank-Wojewodzki
>            Priority: Critical
>
> Dears,
> Very often at startup of the streaming application I get this exception:
> {code}
> Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-0000000000, topic=my_instance_medium_topic, partition=1, offset=198900203;
> [org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:212),
> org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:347),
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:420),
> org.apache.kafka.streams.processor.internals.AssignedTasks.process(AssignedTasks.java:339),
> org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:648),
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:513),
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:482),
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:459)]
> in thread my_application-my_instance-my_instance_medium-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62
> {code}
> and then (without a shutdown request from my side):
> {code}
> 2018-07-30 07:45:02 [ar313] [INFO ] StreamThread:912 - stream-thread [my_application-my_instance-my_instance-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62] State transition from PENDING_SHUTDOWN to DEAD.
> {code}
> What is this?
> How can it be handled correctly?
> Thanks in advance for help.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
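On the reporter's question of how to react to a StreamThread dying: `KafkaStreams#setUncaughtExceptionHandler` (available in the affected 0.11.x/1.1.x versions) accepts a plain JDK `Thread.UncaughtExceptionHandler`, so the app can at least log the failure and trigger shutdown or alerting rather than discover the DEAD transition later. A minimal sketch, shown without the Kafka dependency; the class name, the log format, and the demo thread are illustrative assumptions only:

```java
// Sketch: a handler that reacts to a stream thread dying.
// In the real app this would be registered before streams.start():
//   streams.setUncaughtExceptionHandler(handler);
public class StreamsDeathHandler {
    // Pure helper so the message format is easy to test in isolation.
    static String describe(Thread t, Throwable e) {
        return "Stream thread " + t.getName() + " died: " + e.getMessage();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread.UncaughtExceptionHandler handler =
            (thread, throwable) -> System.err.println(describe(thread, throwable));

        // Demo only: simulate a processing failure on a named thread.
        Thread demo = new Thread(
            () -> { throw new RuntimeException("simulated processing failure"); },
            "StreamThread-62");
        demo.setUncaughtExceptionHandler(handler);
        demo.start();
        demo.join();
    }
}
```

Note this only observes the death; it does not resurrect the thread or fix the underlying cause of the exception.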