[ 
https://issues.apache.org/jira/browse/ARTEMIS-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mario Mahovlić updated ARTEMIS-2555:
------------------------------------
    Description: 
We run Artemis embedded in our Spring service. It ran fine for an extended 
period of time, but at some point we started getting timeout exceptions when 
producing messages to the queue (stack traces for regular and large messages 
are attached).

We produce both regular and large messages to the queue, and we got timeouts 
for both types (large messages are ~130 KB on average). The message production 
rate to the queue at the time of the incident was ~100k messages per hour.

Artemis is running in persistent mode using the file journal on disk. As 
mentioned in the title, no error or warn level logs were written on the Artemis 
server side, and the timeouts stopped after a service restart.

After some debugging we came to the conclusion that either the threads writing 
to the journal were blocked for an extended period of time, or a journal 
compact operation ran for a long time (or was blocked for some reason) and held 
the write lock on the journal during that time.
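
To make the second hypothesis concrete, here is a minimal, self-contained 
sketch of the failure mode we suspect. It is not Artemis code: the class 
JournalLockSketch, its method names and the timings are all hypothetical, and 
it only assumes that appends and compaction are coordinated through a 
read/write lock. It shows that an append with a timeout fails silently (no 
error is raised on the lock-holder side) for as long as the write lock is held.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class JournalLockSketch {
    // Hypothetical stand-in for a lock shared by appends (read lock)
    // and compaction (write lock). Not taken from the Artemis source.
    private final ReentrantReadWriteLock journalLock = new ReentrantReadWriteLock();

    /** Tries to "append" within the timeout; returns false if the lock was not obtained in time. */
    public boolean append(long timeoutMillis) throws InterruptedException {
        if (!journalLock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false; // a producer would observe this as a timeout
        }
        try {
            return true; // the record would be written here
        } finally {
            journalLock.readLock().unlock();
        }
    }

    /** Simulates a compaction that holds the write lock until 'release' is counted down. */
    public Thread startCompaction(CountDownLatch started, CountDownLatch release) {
        Thread compactor = new Thread(() -> {
            journalLock.writeLock().lock();
            try {
                started.countDown(); // signal that the write lock is now held
                release.await();     // keep holding it until told to stop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                journalLock.writeLock().unlock();
            }
        }, "compactor-sketch");
        compactor.start();
        return compactor;
    }

    public static void main(String[] args) throws InterruptedException {
        JournalLockSketch journal = new JournalLockSketch();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread compactor = journal.startCompaction(started, release);
        started.await();
        // prints "append during compaction: false"
        System.out.println("append during compaction: " + journal.append(100));

        release.countDown();
        compactor.join();
        // prints "append after compaction: true"
        System.out.println("append after compaction: " + journal.append(100));
    }
}
```

If this is what happened, a thread dump taken during the incident should show 
producer threads parked on the lock and one thread holding it.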

Unfortunately, we took no thread dumps during the incident to see exactly where 
the threads were stuck. We did not find any similar incidents reported on these 
boards, so we would like to ask whether anyone has another idea of what might 
cause this behavior.
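
Since the broker runs embedded in the service JVM, a thread dump can also be 
captured from inside the application the next time this happens. The following 
is a minimal sketch using only the standard java.lang.management API; the class 
name ThreadDumpSketch and the output format are our own choices, not anything 
provided by Artemis or Spring.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {
    /**
     * Returns a textual dump of all live threads: name, state, the lock each
     * thread is waiting on (if any), that lock's owner, and the full stack trace.
     */
    public static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: do not collect the lists of locked monitors/synchronizers;
        // the lock a thread is blocked on is still reported.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                out.append(" owned by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append(System.lineSeparator());
            // ThreadInfo.toString() truncates the stack, so print every frame ourselves.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append(System.lineSeparator());
            }
            out.append(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

One option would be to call dumpAllThreads() and log the result wherever the 
producer catches the timeout exception. Running jstack against the service 
process should give equivalent information, provided a JDK is available inside 
the container.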



> Embedded Artemis message producer times out when producing message (no 
> error/warn logged on server side)
> --------------------------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-2555
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2555
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.6.4
>         Environment: The service runs in a Docker container, and the folder 
> containing the journal is mapped to the host machine.
> Metrics for the node on which the service was running show no disk I/O issues 
> at that time.
> Artemis version: 2.6.4, Spring Boot version: 2.1.5.RELEASE
> Relevant Artemis settings (the rest of the settings are default):
> {noformat}
> durable: true
> max-size-bytes: 1GB
> address-full-policy: FAIL
> journal-sync-non-transactional: false
> journal-sync-transactional: false
> {noformat}
> If more info is needed we will try to provide it on request.
>            Reporter: Mario Mahovlić
>            Priority: Major
>         Attachments: artemis stack traces



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
