Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
Great, thanks for confirming that, and kudos to Gary for recognizing the root cause. Tim On Feb 26, 2018 2:34 PM, "neon18" wrote: Sorry for the delay, I finally found some time to try it out. I ran test with ActiveMQ 5.15.3 broker and that seems to resolve the OutOfMemoryError (OOME) issue when a whole bunch of messages are put on the non-persistent queue that exists in ActiveMQ 5.15.2 but not under ActiveMQ 5.14.5. Regards, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
Sorry for the delay, I finally found some time to try it out. I ran test with ActiveMQ 5.15.3 broker and that seems to resolve the OutOfMemoryError (OOME) issue when a whole bunch of messages are put on the non-persistent queue that exists in ActiveMQ 5.15.2 but not under ActiveMQ 5.14.5. Regards, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
Neon, Since 5.15.3 was just released and you'd no longer need to compile from source, would it be possible to upgrade to it to see if Gary's hunch about the root cause of the problem was correct and the problem is fixed in 5.15.3? Tim On Tue, Jan 16, 2018 at 4:02 AM, Gary Tully wrote: > Hi TIm, > the regression caused references to batches of serialised messages destined > for the journal to be retained by the page cache, potentially leading to an > OOM. > The temp store used to cursor non persistent messages uses the same journal > logic and would be effected in the same way. > The trigger for my hunch was that it only ocurred on 5.12.x > > On Fri, 12 Jan 2018 at 13:38 Tim Bain wrote: > > > Gary, > > > > Can you elaborate on what about this situation looks like that problem? > > Because based on the information given so far, it looked to me like the > OP > > was simply overdriving the storage to which the temp files are written. > > > > Tim > > > > On Jan 12, 2018 4:04 AM, "Gary Tully" wrote: > > > > > this looks like an instance of the regression from > > > https://issues.apache.org/jira/browse/AMQ-6815 > > > > > > On Wed, 10 Jan 2018 at 19:42 neon18 wrote: > > > > > > > We run the broker with max heap of 4G and initial of 1G (-Xmx4G > > -Xms1G). > > > > We use non-persistent messages on these particular queues (3 of them > in > > > > this > > > > test). > > > > The number of messages sent to the broker in my last "flood gate" > test > > > was > > > > around 40,000 (40k) in 5 minutes or about 8K msgs/min. After this > flood > > > of > > > > messages, the producers send messages at a much much lower rate. I > have > > > > pretty much the factory default activemq.xml with > > > > systemUsage/memoryUsage/percentOfJvmHeap=70 and queuePrefetch=20 on > > > these 3 > > > > queues. > > > > > > > > So I have seen two different scenarios when lots of non-persistent > > > messages > > > > are put on queue. > > > > 1. Async Writer Thread Shutdown errors (with no prior > warnings/errors), > > > > then > > > > OutOfMemoryErrors > > > > 2. INFO PListStore: ...tmp_storage initialized, ~10 seconds later > WARN: > > > > IOException: OutOfMemoryError: GC overhead limit exceeded ... > ActiveMQ > > > > Transport: tcp:..., then that repeats and other errors follow. Also > > there > > > > is > > > > no warnings/errors prior to the tmp_storage init info log msg. FYI: > the > > > web > > > > console was responsive until I saw the tmp_storage initialized > (KahaDB) > > > > INFO > > > > msg (~4.5 minutes into my test), then it stops responding. The last > > count > > > > of > > > > messages on queues via web console is ~30K msgs under ActiveMQ 5.15.2 > > > > broker. Under the 5.14.5 broker, I was able to see the flood of ~40K > > msgs > > > > added to the 3 queues in ~6 minutes. > > > > > > > > In more controlled testing in the past 2 days, where I clear the > > AMQ_DATA > > > > dir before each test run, I have not seen issue #1 (Async Writer > Thread > > > > Shutdown / OutOfMemoryError). I see issue #2 (OutOfMemoryError) a few > > > > seconds after KahaDB tmp_storage is initialized, then the web console > > > stop > > > > responding and lots of OoMerrors and other errors in the > activemq.log. > > > > > > > > Running with ActiveMQ 5.14.5 and 5.12.2 brokers, we do not have any > > > > OutOfMemoryErrors with this same load or higher load vs running under > > > > ActiveMQ 5.15.2. Running with 5.15.2 broker it seems like there might > > be > > > an > > > > issue with throttling the producers of the queue when the JVM hit's > the > > > > configured memoryUsage (default of 70%). > > > > > > > > Running on that thought, I did another test with > > > > systemUsage/memoryUsage/percentOfJvmHeap=50 %, but same thing > (except > > > that > > > > the OoM error occurs about 20 seconds after the tmp_storage init info > > > log. > > > > > > > > So, I ran the test again with systemUsage/memoryUsage to 20%, same > > thing, > > > > except the OoM error occurs about 40 seconds after the tmp_storage > init > > > > info > > > > log. This time, I also monitored the memory percent used and temp > > memory > > > > percent used via the web console. right after I see the tmp_storage > > init > > > > info log, I can see memUsed=39% tempUsed=1%, ~10 seconds later > > > memUsed=56% > > > > tempUsed=2%, ~10 seconds later memUsed=69% tempUsed=2%, next refresh > > > failed > > > > and of course in the activemq.log I see the OutOfMemoryErrors and > other > > > > warnings and errors appearing in the log. > > > > > > > > Also, I grepped my old logs for "Journal failed" and did see some > > > results, > > > > but they happen after the a few OutOfMemoryErrors, so I did not > include > > > > them > > > > in this thread. > > > > > > > > I can pretty reliably recreate the problem in about 6 minutes (with a > > > clean > > > > amq_data_dir) running ActiveMQ 5.15.2 broker and no issues under > 5.14.5 > > > or > > > > 5.12.2 brokers. > > > > > > > > Regards, > > >
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
Hi TIm, the regression caused references to batches of serialised messages destined for the journal to be retained by the page cache, potentially leading to an OOM. The temp store used to cursor non persistent messages uses the same journal logic and would be effected in the same way. The trigger for my hunch was that it only ocurred on 5.12.x On Fri, 12 Jan 2018 at 13:38 Tim Bain wrote: > Gary, > > Can you elaborate on what about this situation looks like that problem? > Because based on the information given so far, it looked to me like the OP > was simply overdriving the storage to which the temp files are written. > > Tim > > On Jan 12, 2018 4:04 AM, "Gary Tully" wrote: > > > this looks like an instance of the regression from > > https://issues.apache.org/jira/browse/AMQ-6815 > > > > On Wed, 10 Jan 2018 at 19:42 neon18 wrote: > > > > > We run the broker with max heap of 4G and initial of 1G (-Xmx4G > -Xms1G). > > > We use non-persistent messages on these particular queues (3 of them in > > > this > > > test). > > > The number of messages sent to the broker in my last "flood gate" test > > was > > > around 40,000 (40k) in 5 minutes or about 8K msgs/min. After this flood > > of > > > messages, the producers send messages at a much much lower rate. I have > > > pretty much the factory default activemq.xml with > > > systemUsage/memoryUsage/percentOfJvmHeap=70 and queuePrefetch=20 on > > these 3 > > > queues. > > > > > > So I have seen two different scenarios when lots of non-persistent > > messages > > > are put on queue. > > > 1. Async Writer Thread Shutdown errors (with no prior warnings/errors), > > > then > > > OutOfMemoryErrors > > > 2. INFO PListStore: ...tmp_storage initialized, ~10 seconds later WARN: > > > IOException: OutOfMemoryError: GC overhead limit exceeded ... ActiveMQ > > > Transport: tcp:..., then that repeats and other errors follow. Also > there > > > is > > > no warnings/errors prior to the tmp_storage init info log msg. FYI: the > > web > > > console was responsive until I saw the tmp_storage initialized (KahaDB) > > > INFO > > > msg (~4.5 minutes into my test), then it stops responding. The last > count > > > of > > > messages on queues via web console is ~30K msgs under ActiveMQ 5.15.2 > > > broker. Under the 5.14.5 broker, I was able to see the flood of ~40K > msgs > > > added to the 3 queues in ~6 minutes. > > > > > > In more controlled testing in the past 2 days, where I clear the > AMQ_DATA > > > dir before each test run, I have not seen issue #1 (Async Writer Thread > > > Shutdown / OutOfMemoryError). I see issue #2 (OutOfMemoryError) a few > > > seconds after KahaDB tmp_storage is initialized, then the web console > > stop > > > responding and lots of OoMerrors and other errors in the activemq.log. > > > > > > Running with ActiveMQ 5.14.5 and 5.12.2 brokers, we do not have any > > > OutOfMemoryErrors with this same load or higher load vs running under > > > ActiveMQ 5.15.2. Running with 5.15.2 broker it seems like there might > be > > an > > > issue with throttling the producers of the queue when the JVM hit's the > > > configured memoryUsage (default of 70%). > > > > > > Running on that thought, I did another test with > > > systemUsage/memoryUsage/percentOfJvmHeap=50 %, but same thing (except > > that > > > the OoM error occurs about 20 seconds after the tmp_storage init info > > log. > > > > > > So, I ran the test again with systemUsage/memoryUsage to 20%, same > thing, > > > except the OoM error occurs about 40 seconds after the tmp_storage init > > > info > > > log. This time, I also monitored the memory percent used and temp > memory > > > percent used via the web console. right after I see the tmp_storage > init > > > info log, I can see memUsed=39% tempUsed=1%, ~10 seconds later > > memUsed=56% > > > tempUsed=2%, ~10 seconds later memUsed=69% tempUsed=2%, next refresh > > failed > > > and of course in the activemq.log I see the OutOfMemoryErrors and other > > > warnings and errors appearing in the log. > > > > > > Also, I grepped my old logs for "Journal failed" and did see some > > results, > > > but they happen after the a few OutOfMemoryErrors, so I did not include > > > them > > > in this thread. > > > > > > I can pretty reliably recreate the problem in about 6 minutes (with a > > clean > > > amq_data_dir) running ActiveMQ 5.15.2 broker and no issues under 5.14.5 > > or > > > 5.12.2 brokers. > > > > > > Regards, > > > > > > Neon > > > > > > > > > > > > -- > > > Sent from: > > > http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html > > > > > >
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
I checked with IT guys and the disk I/O is not the issue, it can handle the data rate just fine. I did not have time in finding a way to slow the producers. Building from 5.15.x/3 branch from source is not an option at this time. I also believe [AMQ-6815 KahaDB checkpoint needs to fail fast in the event of journal write failure] is not the culprit, althrough it might explain some of the other errors that I see in the activemq.log https://issues.apache.org/jira/browse/AMQ-6815 Does ActiveMQ baseline have any load test that maybe we can simulate this scenario: e.g. send decent amount of messages to a non-persistent queue (with a relatively small JVM max heap) to test this under ActiveMQ 5.14.x vs 5.15.x? That would be better to have some type of regression test for this scenario inside ActiveMQ source code. I have more info from another experiment I did today. I changed systemUsage.memoryUsage to 1%. I see an INFO Usage Manager Memory Limit .. reached on queue://... Producers will be throttled to the rate at which messages are removed from this destination to prevent flooding it... a couple times. Then ~200ms afterward, the tmp_storage initialized INFO log. On the web console, I see Memory percent used to be 99% and 100% and Temp percent used 3%. Eventually, the broker starting having errors in the Async Writer Thread Shutdown. I did a similar test under ActiveMQ 5.14.5 broker (memoryUsage is at default 70%), and here the Memory used says under 70%, of particular note is that the 5.14.5 broker Memory percent used drops alot, from 70% to 25% when it flushes non-persistent message to the Temp storage (I can see Temp used increase). I recall a similar scenario running under ActiveMQ 5.12.2, where the Memory used decreases dramatically when it flushes non-persistent message to the Temp storage on disk. Regards, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
On Jan 10, 2018 12:42 PM, "neon18" wrote: We run the broker with max heap of 4G and initial of 1G (-Xmx4G -Xms1G). We use non-persistent messages on these particular queues (3 of them in this test). The number of messages sent to the broker in my last "flood gate" test was around 40,000 (40k) in 5 minutes or about 8K msgs/min. After this flood of messages, the producers send messages at a much much lower rate. I have pretty much the factory default activemq.xml with systemUsage/memoryUsage/percentOfJvmHeap=70 and queuePrefetch=20 on these 3 queues. Thanks for this info. 4GB max heap is fine, I wanted to make sure you weren't running with something too small like 256MB. 8k msgs/min is 125/sec. How big are they? And what storage type (local SCSI-attached SSD, NFS share, AWS EBS volume, whatever) are you using for the data directory, and what is its throughput capacity? I was originally thinking you filled 4GB of RAM in 6 seconds instead of 6 minutes, and I was thinking that your storage solution might not be capable of accepting over 5Gbps. Even so, you describe running through 2GB of RAM in 20 seconds (under the 50% scenario) and 3.2GB in 40 seconds (under the 20% scenario), which both equate to over 500Mbps. So I'd like to pull this thread to make sure your storage really can accept writes at the rate you're asking it to do them. So I have seen two different scenarios when lots of non-persistent messages are put on queue. 1. Async Writer Thread Shutdown errors (with no prior warnings/errors), then OutOfMemoryErrors 2. INFO PListStore: ...tmp_storage initialized, ~10 seconds later WARN: IOException: OutOfMemoryError: GC overhead limit exceeded ... ActiveMQ Transport: tcp:..., then that repeats and other errors follow. Also there is no warnings/errors prior to the tmp_storage init info log msg. FYI: the web console was responsive until I saw the tmp_storage initialized (KahaDB) INFO msg (~4.5 minutes into my test), then it stops responding. The last count of messages on queues via web console is ~30K msgs under ActiveMQ 5.15.2 broker. Under the 5.14.5 broker, I was able to see the flood of ~40K msgs added to the 3 queues in ~6 minutes. In more controlled testing in the past 2 days, where I clear the AMQ_DATA dir before each test run, I have not seen issue #1 (Async Writer Thread Shutdown / OutOfMemoryError). I see issue #2 (OutOfMemoryError) a few seconds after KahaDB tmp_storage is initialized, then the web console stop responding and lots of OoMerrors and other errors in the activemq.log. For now, let's focus on issue #2 and hope that #1 was somehow a secondary failure caused by #2 (so fixing #2 fixes #1). Once we have #2 fixed, we can address #1 if it reoccurs. Running with ActiveMQ 5.14.5 and 5.12.2 brokers, we do not have any OutOfMemoryErrors with this same load or higher load vs running under ActiveMQ 5.15.2. Running with 5.15.2 broker it seems like there might be an issue with throttling the producers of the queue when the JVM hit's the configured memoryUsage (default of 70%). Running on that thought, I did another test with systemUsage/memoryUsage/percentOfJvmHeap=50 %, but same thing (except that the OoM error occurs about 20 seconds after the tmp_storage init info log. So, I ran the test again with systemUsage/memoryUsage to 20%, same thing, except the OoM error occurs about 40 seconds after the tmp_storage init info log. This time, I also monitored the memory percent used and temp memory percent used via the web console. right after I see the tmp_storage init info log, I can see memUsed=39% tempUsed=1%, ~10 seconds later memUsed=56% tempUsed=2%, ~10 seconds later memUsed=69% tempUsed=2%, next refresh failed and of course in the activemq.log I see the OutOfMemoryErrors and other warnings and errors appearing in the log. If you send your messages slower but the same total number, does everything work as expected? Also, I grepped my old logs for "Journal failed" and did see some results, but they happen after the a few OutOfMemoryErrors, so I did not include them in this thread. If the OOME is thrown from within that code, those could even be caused by the OOME. As long as the OOME happened first, we can ignore them as you've suggested. Just be aware that they indicate that the writer thread is dead (and they caise it to kill itself, IIRC), so if they happen first then they could explain why it's not running, so watch for that. I can pretty reliably recreate the problem in about 6 minutes (with a clean amq_data_dir) running ActiveMQ 5.15.2 broker and no issues under 5.14.5 or 5.12.2 brokers. Is building the 5.15.3 branch from source to see if the fix Gary referred an option? I'm not proposing you use that in operations, just that you try it in a test environment to see if it fixes the problem. Regards, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
Gary, Can you elaborate on what about this situation looks like that problem? Because based on the information given so far, it looked to me like the OP was simply overdriving the storage to which the temp files are written. Tim On Jan 12, 2018 4:04 AM, "Gary Tully" wrote: > this looks like an instance of the regression from > https://issues.apache.org/jira/browse/AMQ-6815 > > On Wed, 10 Jan 2018 at 19:42 neon18 wrote: > > > We run the broker with max heap of 4G and initial of 1G (-Xmx4G -Xms1G). > > We use non-persistent messages on these particular queues (3 of them in > > this > > test). > > The number of messages sent to the broker in my last "flood gate" test > was > > around 40,000 (40k) in 5 minutes or about 8K msgs/min. After this flood > of > > messages, the producers send messages at a much much lower rate. I have > > pretty much the factory default activemq.xml with > > systemUsage/memoryUsage/percentOfJvmHeap=70 and queuePrefetch=20 on > these 3 > > queues. > > > > So I have seen two different scenarios when lots of non-persistent > messages > > are put on queue. > > 1. Async Writer Thread Shutdown errors (with no prior warnings/errors), > > then > > OutOfMemoryErrors > > 2. INFO PListStore: ...tmp_storage initialized, ~10 seconds later WARN: > > IOException: OutOfMemoryError: GC overhead limit exceeded ... ActiveMQ > > Transport: tcp:..., then that repeats and other errors follow. Also there > > is > > no warnings/errors prior to the tmp_storage init info log msg. FYI: the > web > > console was responsive until I saw the tmp_storage initialized (KahaDB) > > INFO > > msg (~4.5 minutes into my test), then it stops responding. The last count > > of > > messages on queues via web console is ~30K msgs under ActiveMQ 5.15.2 > > broker. Under the 5.14.5 broker, I was able to see the flood of ~40K msgs > > added to the 3 queues in ~6 minutes. > > > > In more controlled testing in the past 2 days, where I clear the AMQ_DATA > > dir before each test run, I have not seen issue #1 (Async Writer Thread > > Shutdown / OutOfMemoryError). I see issue #2 (OutOfMemoryError) a few > > seconds after KahaDB tmp_storage is initialized, then the web console > stop > > responding and lots of OoMerrors and other errors in the activemq.log. > > > > Running with ActiveMQ 5.14.5 and 5.12.2 brokers, we do not have any > > OutOfMemoryErrors with this same load or higher load vs running under > > ActiveMQ 5.15.2. Running with 5.15.2 broker it seems like there might be > an > > issue with throttling the producers of the queue when the JVM hit's the > > configured memoryUsage (default of 70%). > > > > Running on that thought, I did another test with > > systemUsage/memoryUsage/percentOfJvmHeap=50 %, but same thing (except > that > > the OoM error occurs about 20 seconds after the tmp_storage init info > log. > > > > So, I ran the test again with systemUsage/memoryUsage to 20%, same thing, > > except the OoM error occurs about 40 seconds after the tmp_storage init > > info > > log. This time, I also monitored the memory percent used and temp memory > > percent used via the web console. right after I see the tmp_storage init > > info log, I can see memUsed=39% tempUsed=1%, ~10 seconds later > memUsed=56% > > tempUsed=2%, ~10 seconds later memUsed=69% tempUsed=2%, next refresh > failed > > and of course in the activemq.log I see the OutOfMemoryErrors and other > > warnings and errors appearing in the log. > > > > Also, I grepped my old logs for "Journal failed" and did see some > results, > > but they happen after the a few OutOfMemoryErrors, so I did not include > > them > > in this thread. > > > > I can pretty reliably recreate the problem in about 6 minutes (with a > clean > > amq_data_dir) running ActiveMQ 5.15.2 broker and no issues under 5.14.5 > or > > 5.12.2 brokers. > > > > Regards, > > > > Neon > > > > > > > > -- > > Sent from: > > http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html > > >
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
this looks like an instance of the regression from https://issues.apache.org/jira/browse/AMQ-6815 On Wed, 10 Jan 2018 at 19:42 neon18 wrote: > We run the broker with max heap of 4G and initial of 1G (-Xmx4G -Xms1G). > We use non-persistent messages on these particular queues (3 of them in > this > test). > The number of messages sent to the broker in my last "flood gate" test was > around 40,000 (40k) in 5 minutes or about 8K msgs/min. After this flood of > messages, the producers send messages at a much much lower rate. I have > pretty much the factory default activemq.xml with > systemUsage/memoryUsage/percentOfJvmHeap=70 and queuePrefetch=20 on these 3 > queues. > > So I have seen two different scenarios when lots of non-persistent messages > are put on queue. > 1. Async Writer Thread Shutdown errors (with no prior warnings/errors), > then > OutOfMemoryErrors > 2. INFO PListStore: ...tmp_storage initialized, ~10 seconds later WARN: > IOException: OutOfMemoryError: GC overhead limit exceeded ... ActiveMQ > Transport: tcp:..., then that repeats and other errors follow. Also there > is > no warnings/errors prior to the tmp_storage init info log msg. FYI: the web > console was responsive until I saw the tmp_storage initialized (KahaDB) > INFO > msg (~4.5 minutes into my test), then it stops responding. The last count > of > messages on queues via web console is ~30K msgs under ActiveMQ 5.15.2 > broker. Under the 5.14.5 broker, I was able to see the flood of ~40K msgs > added to the 3 queues in ~6 minutes. > > In more controlled testing in the past 2 days, where I clear the AMQ_DATA > dir before each test run, I have not seen issue #1 (Async Writer Thread > Shutdown / OutOfMemoryError). I see issue #2 (OutOfMemoryError) a few > seconds after KahaDB tmp_storage is initialized, then the web console stop > responding and lots of OoMerrors and other errors in the activemq.log. > > Running with ActiveMQ 5.14.5 and 5.12.2 brokers, we do not have any > OutOfMemoryErrors with this same load or higher load vs running under > ActiveMQ 5.15.2. Running with 5.15.2 broker it seems like there might be an > issue with throttling the producers of the queue when the JVM hit's the > configured memoryUsage (default of 70%). > > Running on that thought, I did another test with > systemUsage/memoryUsage/percentOfJvmHeap=50 %, but same thing (except that > the OoM error occurs about 20 seconds after the tmp_storage init info log. > > So, I ran the test again with systemUsage/memoryUsage to 20%, same thing, > except the OoM error occurs about 40 seconds after the tmp_storage init > info > log. This time, I also monitored the memory percent used and temp memory > percent used via the web console. right after I see the tmp_storage init > info log, I can see memUsed=39% tempUsed=1%, ~10 seconds later memUsed=56% > tempUsed=2%, ~10 seconds later memUsed=69% tempUsed=2%, next refresh failed > and of course in the activemq.log I see the OutOfMemoryErrors and other > warnings and errors appearing in the log. > > Also, I grepped my old logs for "Journal failed" and did see some results, > but they happen after the a few OutOfMemoryErrors, so I did not include > them > in this thread. > > I can pretty reliably recreate the problem in about 6 minutes (with a clean > amq_data_dir) running ActiveMQ 5.15.2 broker and no issues under 5.14.5 or > 5.12.2 brokers. > > Regards, > > Neon > > > > -- > Sent from: > http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html >
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
We run the broker with max heap of 4G and initial of 1G (-Xmx4G -Xms1G). We use non-persistent messages on these particular queues (3 of them in this test). The number of messages sent to the broker in my last "flood gate" test was around 40,000 (40k) in 5 minutes or about 8K msgs/min. After this flood of messages, the producers send messages at a much much lower rate. I have pretty much the factory default activemq.xml with systemUsage/memoryUsage/percentOfJvmHeap=70 and queuePrefetch=20 on these 3 queues. So I have seen two different scenarios when lots of non-persistent messages are put on queue. 1. Async Writer Thread Shutdown errors (with no prior warnings/errors), then OutOfMemoryErrors 2. INFO PListStore: ...tmp_storage initialized, ~10 seconds later WARN: IOException: OutOfMemoryError: GC overhead limit exceeded ... ActiveMQ Transport: tcp:..., then that repeats and other errors follow. Also there is no warnings/errors prior to the tmp_storage init info log msg. FYI: the web console was responsive until I saw the tmp_storage initialized (KahaDB) INFO msg (~4.5 minutes into my test), then it stops responding. The last count of messages on queues via web console is ~30K msgs under ActiveMQ 5.15.2 broker. Under the 5.14.5 broker, I was able to see the flood of ~40K msgs added to the 3 queues in ~6 minutes. In more controlled testing in the past 2 days, where I clear the AMQ_DATA dir before each test run, I have not seen issue #1 (Async Writer Thread Shutdown / OutOfMemoryError). I see issue #2 (OutOfMemoryError) a few seconds after KahaDB tmp_storage is initialized, then the web console stop responding and lots of OoMerrors and other errors in the activemq.log. Running with ActiveMQ 5.14.5 and 5.12.2 brokers, we do not have any OutOfMemoryErrors with this same load or higher load vs running under ActiveMQ 5.15.2. Running with 5.15.2 broker it seems like there might be an issue with throttling the producers of the queue when the JVM hit's the configured memoryUsage (default of 70%). Running on that thought, I did another test with systemUsage/memoryUsage/percentOfJvmHeap=50 %, but same thing (except that the OoM error occurs about 20 seconds after the tmp_storage init info log. So, I ran the test again with systemUsage/memoryUsage to 20%, same thing, except the OoM error occurs about 40 seconds after the tmp_storage init info log. This time, I also monitored the memory percent used and temp memory percent used via the web console. right after I see the tmp_storage init info log, I can see memUsed=39% tempUsed=1%, ~10 seconds later memUsed=56% tempUsed=2%, ~10 seconds later memUsed=69% tempUsed=2%, next refresh failed and of course in the activemq.log I see the OutOfMemoryErrors and other warnings and errors appearing in the log. Also, I grepped my old logs for "Journal failed" and did see some results, but they happen after the a few OutOfMemoryErrors, so I did not include them in this thread. I can pretty reliably recreate the problem in about 6 minutes (with a clean amq_data_dir) running ActiveMQ 5.15.2 broker and no issues under 5.14.5 or 5.12.2 brokers. Regards, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
Hmmm, I think I may have misinterpreted a few of the things you've written, so I want to take a step back and recalibrate my understanding of the behavior you're seeing. On Tue, Jan 9, 2018 at 1:04 PM, neon18 wrote: > > Do you get the "thread shut down" message on the first message enqueued, > or > > only after a certain number of messages are sent? > I believe, after a large number of of messages have been enqueued, is when > it enters the OutOfMemoryError state. In another test run, I did not see > the > "Async Writer Thread Shutdown error", just an OutOfMemoryError. > i.e. I see an INFO PListStore:[.../data/localhost/tmp_storage] initialized > ... | ActiveMQ BrokerService.worker.1 message > after about 15 seconds (with no longs in between), I see a WARN Trasnport > Conection to: tcp://... failed: java.io.IOException: Unexpected error > occurred: java.lang.OutOfMemoryError: Java heap space | > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > Transport: tcp:///... > So I thought I had understood that you started sending messages, and at some point you started seeing "thread shutdown" messages in the logs, and only after a large number of those, you finally saw an OutOfMemoryError. That series of events seems like a straightforward problem: the thread that would be relieving memory pressure isn't running, so it doesn't relieve that pressure, and eventually you run out of memory. Simple, straightforward, and the only question is why the thread isn't running when the time comes. But the description you just laid out 1) says that the "thread shutdown" message doesn't necessarily occur at all under some scenarios, and 2) doesn't clearly state the ordering of when the OOM occurs relative to the "thread shutdown" error when they do both occur, which calls into question my understanding that the "thread shutdown" messages were the cause of the OOM and not its result. So first, can you please clarify the timing of the OOM vs. the "thread shutdown" errors when they both occur? And second, we may have to look at other possibilities for why you're running out of memory; I'll try to poke at some of those further down in this response. > So in this scenario, maybe it is taking longer to move the in-memory > message > to the temp storage as too many messages are coming in under 5.15 broker > where as the 5.14 and 5.12 broker's might be "blocking" or throttling > incoming messages to queue to allow for messages to be moved to temp > storage? > I don't see any obvious new code paths that explicitly remove any throttling, but I don't claim that the quick look I took this evening is fully complete. I do see a code path that's new in 5.15.x that would allow the thread to be shut down if another error (either an OOM or something else) occurred while writing messages to disk, but that code path would result in a WARN message in the logs that starts with "Journal failed while writing at:" and your earlier response said that you saw no other errors in the logs. So if what you wrote earlier is accurate, I don't think this new code path could explain what you're seeing. > > The temp store does indeed use KahaDB as its store type. Do you see > KahaDB > > data files (with a .log extension) in that directory? > Yes > > Are they recreated if you stop the broker, delete them, and start the > > broker? And are any bytes > > written to them after the broker's initial startup? > I have not tried this, should the message producer for the queue be on and > putting stuff into the it? > My goal was simply to confirm that bytes were being written to the files when messages were being enqueued, eliminating any bytes that might have been written as part of the process of creating those files with no messages in them, since there might be a small amount of overhead bytes in an "empty" file. So I'd suggest you do the steps I listed without any messages being enqueued, give the broker a few seconds (or even minutes, why not) to let everything stabilize, then start producing messages, and see whether bytes are eventually written to the files after that point. > > It seems as though either the thread that dumps messages out of memory > and > > into the temp store isn't started at all, or is dying at some point, and > > it > > would be useful to know which it is. Can you attach JConsole to the > broker > > process as soon as possible after it starts (before sending any messages) > > and look for the Async Writer Thread in the list of threads? If it's > > there, > > then try to figure out when it dies (e.g. after the first message is > > sent). > When I startup broker w/o any producer, there is no "Async Writer Thread", > only an "ActiveMQ Data File Writer" thread > That's the one you want. When I wrote my previous response I didn't have access to the source code, so I didn't have a way to confirm the exact name of the thread, but what you quoted is the thread in question. > After about 100 messages send to the non
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
> Do you get the "thread shut down" message on the first message enqueued, or > only after a certain number of messages are sent? I believe, after a large number of of messages have been enqueued, is when it enters the OutOfMemoryError state. In another test run, I did not see the "Async Writer Thread Shutdown error", just an OutOfMemoryError. i.e. I see an INFO PListStore:[.../data/localhost/tmp_storage] initialized ... | ActiveMQ BrokerService.worker.1 message after about 15 seconds (with no longs in between), I see a WARN Trasnport Conection to: tcp://... failed: java.io.IOException: Unexpected error occurred: java.lang.OutOfMemoryError: Java heap space | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///... So in this scenario, maybe it is taking longer to move the in-memory message to the temp storage as too many messages are coming in under 5.15 broker where as the 5.14 and 5.12 broker's might be "blocking" or throttling incoming messages to queue to allow for messages to be moved to temp storage? > Have you set any options in your activemq.xml that might change the > broker's behavior regarding the temp store? If you can, please post your > configuration. I am pretty much using the factory stock activemq.xml, with an increased maximumConnections and a few slow workoff queues with queuePrefetch=1 > The temp store does indeed use KahaDB as its store type. Do you see KahaDB > data files (with a .log extension) in that directory? Yes > Are they recreated if you stop the broker, delete them, and start the > broker? And are any bytes > written to them after the broker's initial startup? I have not tried this, should the message producer for the queue be on and putting stuff into the it? > It seems as though either the thread that dumps messages out of memory and > into the temp store isn't started at all, or is dying at some point, and > it > would be useful to know which it is. Can you attach JConsole to the broker > process as soon as possible after it starts (before sending any messages) > and look for the Async Writer Thread in the list of threads? If it's > there, > then try to figure out when it dies (e.g. after the first message is > sent). When I startup broker w/o any producer, there is no "Async Writer Thread", only an "ActiveMQ Data File Writer" thread After about 100 messages send to the non-persistent queue (the messages are all still in memory), there is still no "Async Writer Thread", , only an "ActiveMQ Data File Writer" thread and all is fine (e.g. consumers can de-queue messages). Only when I open the flood gates and push lots of messages on the non-persistent queue does the broker crumble with the OutOfMemoryError then dies in the 5.15 broker. Thanks for looking into this issue with me. Sincerely, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
OK, this is a good start, thanks for working to narrow down the variables a bit. Do you get the "thread shut down" message on the first message enqueued, or only after a certain number of messages are sent? Have you set any options in your activemq.xml that might change the broker's behavior regarding the temp store? If you can, please post your configuration. On Jan 8, 2018 7:57 AM, "neon18" wrote: I have tried out a few things to reduce the variables, one step at a time. 1. I upgrade the ActiveMQ client libraries to 5.15.2 to match the server. This results in same issue (OutOfMemoryError on ActiveMQ 5.15.2 server). 2. I change the memoryUsage percentOfJvmHeap from 70% to 50%, restarted the server but same. Issue. 3. I downgraded to ActiveMQ 5.14.5 server, and all is good (also when I downgraded to ActiveMQ 5.12.2 - all is good) There are not other errors in activemq.log. I am using a non-persistent queue, but when memory hits the default the memory usage percent of 70%, , the server *should* flush to the temp storage. So I believe ActiveMQ 5.15 has a bug when flushing in-memory (non-persistent) messages to the temp storage under /localhost/tmp_storage. I am guessing the temp storage is using KahaDB underneath). The temp store does indeed use KahaDB as its store type. Do you see KahaDB data files (with a .log extension) in that directory? Are they recreated if you stop the broker, delete them, and start the broker? And are any bytes written to them after the broker's initial startup? It seems as though either the thread that dumps messages out of memory and into the temp store isn't started at all, or is dying at some point, and it would be useful to know which it is. Can you attach JConsole to the broker process as soon as possible after it starts (before sending any messages) and look for the Async Writer Thread in the list of threads? If it's there, then try to figure out when it dies (e.g. after the first message is sent). With no messages (or with a few messages before JVM max memory), ActiveMQ 5.15.2 server is fine, it responds on the Web console, producer and consumer clients can add to queue and dequeue messages fine. After the OutOfMemoryError, the server (broker) no longer responds to any client and the web console is NOT responsive. All bets are off after an OOM in a Java process; the only safe option is to restart the process. So that doesn't surprise me. Regards, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
I have tried out a few things to reduce the variables, one step at a time. 1. I upgrade the ActiveMQ client libraries to 5.15.2 to match the server. This results in same issue (OutOfMemoryError on ActiveMQ 5.15.2 server). 2. I change the memoryUsage percentOfJvmHeap from 70% to 50%, restarted the server but same. Issue. 3. I downgraded to ActiveMQ 5.14.5 server, and all is good (also when I downgraded to ActiveMQ 5.12.2 - all is good) There are not other errors in activemq.log. I am using a non-persistent queue, but when memory hits the default the memory usage percent of 70%, , the server *should* flush to the temp storage. So I believe ActiveMQ 5.15 has a bug when flushing in-memory (non-persistent) messages to the temp storage under /localhost/tmp_storage. I am guessing the temp storage is using KahaDB underneath). With no messages (or with a few messages before JVM max memory), ActiveMQ 5.15.2 server is fine, it responds on the Web console, producer and consumer clients can add to queue and dequeue messages fine. After the OutOfMemoryError, the server (broker) no longer responds to any client and the web console is NOT responsive. Regards, Neon -- Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
I'm not aware of any incompatibility between a 5.12.2 client and a 5.15.2 broker, but we don't explicitly test compatibility so it's possible. But I can't think of an obvious mechanism for an older client to cause a store writer thread in the broker to exit, so a compatibility problem isn't where I'd start looking. The thread that's shut down writes content to disk in KahaDB. Presumably this is a persistent message and you're using KahaDB as your persistence store implementation. Are there any errors in the broker's logs prior to the one you quoted, which might indicate problems with KahaDB or the writer thread? Can you put messages on the queue via the web console? And if you disconnect all clients and then restart the broker, can you put messages on the queue via the web console? Let's make sure your KahaDB confirmation is healthy with no clients at all, before introducing the extra variable of an external producer (let alone one on an earlier version of ActiveMQ). Tim On Jan 5, 2018 9:03 AM, "neon18" wrote: > We recently upgraded to ActiveMQ server 5.15.2 from 5.12.2 and when putting > some messages onto a Queue, we get the following error > > ERROR | Caught an Exception adding a message: ActiveMQObjectMessage > {commandId = 2705, ...} first to FilePendingMessageCursor | > org.apache.acivemq.broker.region.cursors.FIlePendingMEssageCursor | > ActiveMQ > Transport: tcp:// > java.io.IOException: Async Writer Thread Shutdown > at > org.apache.activemq.store.kahadb.disk.journal.DataFileAppender.enqueue( > DataFileAppender.java:164)[activemq-kahadb-store-5.15.2.jar:5.15.2] > at > org.apache.activemq.store.kahadb.disk.journal.DataFileAppender.storeItem( > DataFileAppender.java:129)[activemq-kahadb-store-5.15.2.jar:5.15.2] > at > org.apache.activemq.store.kahadb.disk.journal.Journal. > write(Journal.java:958)[activemq-kahadb-store-5.15.2.jar:5.15.2] > at > org.apache.activemq.store.kahadb.plist.PListStoreImpl. > write(PListStoreImpl.java:444)[activemq-kahadb-store-5.15.2.jar:5.15.2] > at > org.apache.activemq.store.kahadb.plist.PListImpl.write( > PListImpl.java:105)[activemq-kahadb-store-5.15.2.jar:5.15.2] > at > org.apache.activemq.broker.region.cursors.FilePendingMessageCursor. > tryAddMessageLast(FilePendingMessageCursor.java: > 237)[activemq-broker-5.15.2.jar:5.15.2] > at > org.apache.activemq.broker.region.cursors.StoreQueueCursor. > tryAddMessageLast(StoreQueueCursor.java:100)[activemq-broker-5.15.2.jar:5. > 15.2] > at > org.apache.activemq.broker.region.Queue.tryCursorAdd( > Queue.java:1868)[activemq-broker-5.15.2.jar:5.15.2] > ... > > When we were using ActiveMQ 5.12.2 server we did not have this issue. > > Any ideas on why this is happening (there are lots of the above errors in > the activemq.log) and how to resolve it? > > Eventually, the 5.15.2 server logs a WARN | Transport Connection to > tcp://... failed java.io.Exception: Unexpected error occurred: > java.lang.OutOfMemoryError: Java heap space | > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > Transport: tcp:///... > > The client that submits to the queue is still using the > activemq-client-5.12.2.jar (do we need to also upgrade the client to use > the > matching version as the server? > > Regards, > > Neon > > > > > > > -- > Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User- > f2341805.html >