[ 
https://issues.apache.org/jira/browse/KAFKA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606303#comment-16606303
 ] 

John Roesler commented on KAFKA-7214:
-------------------------------------

Hi [~habdank],

If I understood the scenario, you get stability problems when you increase the 
load by 50x but only increase memory 6x. This doesn't seem too surprising.

Following a simple linear projection: if your app is healthy at 100 msg/s with a 
300 MB heap, and it were heap-constrained to begin with, then increasing the load 
by 50x would also require 50x the heap, which puts you at 15 GB. The fact that you 
are healthy at 1/3 of that, or 5 GB, indicates that it actually wasn't 
heap-constrained to begin with.

It seems like your initial hypothesis was that you needed 3 MB of heap per msg/s, 
and your new hypothesis is that you need 1 MB per msg/s. So if you scale up again, 
you can use this as a starting point and adjust up or down, depending on how the 
system performs.

Note that on the lower end of the spectrum, the overhead will tend to dominate, 
so I'm not sure if you can run 100 msg/s in only 100 MB of heap, and you almost 
certainly cannot run 1 msg/s in 1 MB of heap.

Scaling up, the data itself will dominate the heap, but you'll find that there 
is also a limit to this thinking, as Java performs poorly with very large heaps 
(in the terabyte range).
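
If it helps while you experiment, you can also log what the JVM actually has and 
is using; this is plain JDK API, nothing Streams-specific, just a sanity check 
against whatever -Xmx you picked:
{code:java}
public class HeapCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long maxHeapMb = rt.maxMemory() / (1024 * 1024);                        // roughly the -Xmx limit
        long usedHeapMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024); // currently occupied heap
        System.out.printf("heap used: %d MB of max %d MB%n", usedHeapMb, maxHeapMb);
    }
}
{code}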

 

About your analysis:

> 5000 Msg/s ~ 150 000 Msgs/30 sec ~ 150 MB

This is a good lower bound, since you know that at a minimum all the live 
messages must reside in memory, but it is not likely to be a tight lower bound.

This assumes that there is no overhead at all, i.e. that the only thing in the 
heap is the messages themselves. That cannot be true: the JVM has overhead of its 
own, Streams and the clients maintain their own data structures, each message is 
resident a couple of times over due to serialization and deserialization, and 
finally, because Java is memory managed, every object continues to occupy heap 
after it is no longer live, until it gets collected.
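
To make that concrete, here is the same lower-bound estimate with a hedged 
overhead factor. The ~1 KB message size is only what your "150 000 msgs ~ 150 MB" 
figure implies, and the multiplier is a guess to illustrate that serde copies and 
uncollected garbage inflate the real footprint:
{code:java}
public class HeapLowerBound {
    public static void main(String[] args) {
        double msgsPerSec = 5_000.0;
        double windowSec = 30.0;       // the 30-second window from your estimate
        double avgMsgBytes = 1_024.0;  // assumption implied by 150,000 msgs ~ 150 MB

        // Bare lower bound: only the live message payloads themselves.
        double lowerBoundMb = msgsPerSec * windowSec * avgMsgBytes / (1024 * 1024); // ~146 MB

        // Illustrative only: serde copies, Streams/client structures, and
        // not-yet-collected garbage multiply the real footprint.
        double overheadFactor = 3.0;   // a guess, not a measured number
        double roughEstimateMb = lowerBoundMb * overheadFactor;

        System.out.printf("lower bound: %.0f MB, rough estimate with overhead: %.0f MB%n",
                lowerBoundMb, roughEstimateMb);
    }
}
{code}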

 

Does this help?

-John

 

> Mystic FATAL error
> ------------------
>
>                 Key: KAFKA-7214
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7214
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.11.0.3, 1.1.1
>            Reporter: Seweryn Habdank-Wojewodzki
>            Priority: Critical
>
> Dears,
> Very often at startup of the streaming application I got exception:
> {code}
> Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-0000000000, topic=my_instance_medium_topic, partition=1, offset=198900203;
> [org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:212),
>  org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:347),
>  org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:420),
>  org.apache.kafka.streams.processor.internals.AssignedTasks.process(AssignedTasks.java:339),
>  org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:648),
>  org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:513),
>  org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:482),
>  org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:459)]
> in thread my_application-my_instance-my_instance_medium-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62
> {code}
> and then (without shutdown request from my side):
> {code}
> 2018-07-30 07:45:02 [ar313] [INFO ] StreamThread:912 - stream-thread [my_application-my_instance-my_instance-72ee1819-edeb-4d85-9d65-f67f7c321618-StreamThread-62] State transition from PENDING_SHUTDOWN to DEAD.
> {code}
> What is this?
> How to correctly handle it?
> Thanks in advance for help.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
