[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657510#comment-16657510
 ] 

Alan Protasio edited comment on AMQ-7080 at 10/20/18 4:41 AM:
--

[~gtully] [~jgenender]  [~cshannon]

Guys, I found a way to do it without any performance hit.

So I'm doing it the other way around: I'm writing the nextTxid and the sequenceSet hash
at the end of the db.free file.

The nextTxid shows whether db.data and db.free are in sync, and the hash shows whether
the file was fully written (a check for partial writes).

This change is also backward compatible: if this metadata is not present at the end of
the db.free file, I just ignore it (setting hashcheckpoint and nextTxid to -1).

So, at the checkpoint I only serialize the freeList (with the metadata) into a
ByteArrayOutputStream and do the actual write asynchronously (see storeFreeListAsync).
storeFreeListAsync makes sure the bytes represent the sequence set at checkpoint time,
but performs the actual write async (not blocking the checkpoint). In the recovery path,
we have ways of knowing whether db.free is up to date, was fully written, and is in sync
with db.data.

One thing I found strange, though: when a full recovery is performed, the number of free
pages can differ from the original one. After a clean shutdown (the current
implementation) the free pages will not be the same as when we scan the whole index
after an unclean shutdown.

For instance, in the test "testFreePageRecoveryUncleanShutdown", if we compare
pf2.getFreePageCount() and pf.getFreePageCount(), the numbers will not be the same.

So, this change has the same behaviour as a clean shutdown.
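
For illustration only, here is a rough sketch of how trailing metadata on db.free could
be written and then checked on recovery. This is not the actual patch: the class name,
the CRC32 choice of checksum, and the helper methods are all assumptions.

{code:java}
// Hypothetical sketch of the "nextTxid + hash at the end of db.free" idea above.
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

public class FreeListMetadataSketch {

    /** Serialize the free-page sequence set and append nextTxid plus a checksum. */
    static byte[] serializeWithMetadata(long[] freePages, long nextTxid) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(freePages.length);
        for (long page : freePages) {
            out.writeLong(page);
        }
        CRC32 hash = new CRC32();
        hash.update(buffer.toByteArray());   // hash covers the serialized sequence set
        out.writeLong(nextTxid);             // ties this db.free to a db.data transaction id
        out.writeLong(hash.getValue());      // lets recovery detect a partial write
        out.flush();
        return buffer.toByteArray();
    }

    /** Recovery-side check; -1 means the metadata is absent (older db.free layout). */
    static boolean canTrustFreeList(long storedNextTxid, long expectedNextTxid,
                                    long storedHash, long recomputedHash) {
        if (storedNextTxid == -1 || storedHash == -1) {
            return false;                           // backward compatible: fall back to a full scan
        }
        return storedNextTxid == expectedNextTxid   // db.free and db.data are in sync
            && storedHash == recomputedHash;        // the file was fully written
    }
}
{code}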


was (Author: alanprot):
[~gtully] [~jgenender]  [~cshannon]

Guys, I found a way to do it without any performance hit.

So I'm doing it the other way around: I'm writing the nextTxid and the sequenceSet hash
at the end of the db.free file.

The nextTxid shows whether db.data and db.free are in sync, and the hash shows whether
there was a partial write (I created a test for both cases).

This change is also backward compatible: if this metadata is not present at the end of
the db.free file, I just ignore it (setting the hash and nextTxid to -1).

So, now in the checkpoint I only serialize the freeList (with the metadata) into a
ByteArrayOutputStream and write the file asynchronously (see storeFreeListAsync).

storeFreeListAsync makes sure the bytes represent the sequence set at checkpoint time,
but the write can be done later. In the recovery path, we have ways of knowing whether
db.free is up to date and was fully written last time.

One thing I found strange, though: when a full recovery is performed, the number of free
pages can differ from the original one.

For instance, in the test "testFreePageRecoveryUncleanShutdown", if we compare
pf2.getFreePageCount() and pf.getFreePageCount(), the numbers will not be the same.

 


[jira] [Assigned] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré reassigned AMQ-7080:
-

Assignee: Jean-Baptiste Onofré

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0, 5.15.7
>
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the
> free pages in the index. To recover this information, ActiveMQ reads the
> whole index during shutdown searching for free pages and then saves the
> db.free file. This operation can take a long time, making failover slower
> (during the shutdown, ActiveMQ still holds the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide
> high availability such that if a broker is killed, another broker can take
> over immediately."
> {quote}
> It is important to note that if the shutdown takes more than
> ACTIVEMQ_KILL_MAXSECONDS seconds, any following shutdown will be unclean. The
> broker will stay in this state unless the index is deleted (in this state
> every failover takes more than ACTIVEMQ_KILL_MAXSECONDS, so if you increase
> this time to 5 minutes, your failover can take more than 5 minutes).
>  
> To prevent ActiveMQ from reading the whole index file to search for free
> pages, we can keep track of them on every checkpoint. To do that, we need to
> be sure that db.data and db.free are in sync. We can achieve that with an
> attribute in db.free that is referenced by db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give it a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> After a crash, we can check whether db.data has the same freePageUniqueId as
> db.free. If so, we can safely use the free-page information contained in
> db.free.
> Now, the only case where we have to read the whole index file again is if the
> crash happens between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we have to save db.free during
> the checkpoint, which can possibly increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in
> db.free, as it references a stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario, after PageFile#load, P1 will be free and the replay
> will then mark P1 as occupied or occupy another page (now that the recovery
> of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and reflect
> the state at T1 (the checkpoint). If they are in sync, we can trust db.free.
> This is a really quick draft of what I'm suggesting... If you guys agree, I
> can create the proper patch afterwards:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590
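
As an illustration of the two-step protocol above (the class and field names below are
invented for the sketch, not taken from the draft patch), the checkpoint stamps db.free
with an id and then records the same id in db.data, and recovery trusts db.free only
when the two ids match:

{code:java}
// Illustrative sketch only; FreeFile/IndexFile stand in for db.free/db.data.
import java.util.concurrent.atomic.AtomicLong;

public class FreePageCheckpointSketch {

    static class FreeFile  { long freePageUniqueId; }   // stands in for db.free
    static class IndexFile { long freePageUniqueId; }   // stands in for db.data metadata

    private final AtomicLong idGenerator = new AtomicLong();

    /** Steps 1 and 2 of the checkpoint described above. */
    void checkpoint(FreeFile dbFree, IndexFile dbData) {
        long id = idGenerator.incrementAndGet();
        dbFree.freePageUniqueId = id;   // 1 - save db.free and give it a freePageUniqueId
        dbData.freePageUniqueId = id;   // 2 - save the same freePageUniqueId in the db.data metadata
    }

    /** After a crash, the free list is trusted only when both ids match. */
    boolean canReuseFreeList(FreeFile dbFree, IndexFile dbData) {
        return dbFree.freePageUniqueId == dbData.freePageUniqueId;
    }
}
{code}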



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated AMQ-7080:
--
Fix Version/s: 5.15.7
   5.16.0



[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657510#comment-16657510
 ] 

Alan Protasio commented on AMQ-7080:


[~gtully] [~jgenender]  [~cshannon]

Guys, I found a way to do it without any performance hit.

So I'm doing it the other way around: I'm writing the nextTxid and the sequenceSet hash
at the end of the db.free file.

The nextTxid shows whether db.data and db.free are in sync, and the hash shows whether
there was a partial write (I created a test for both cases).

This change is also backward compatible: if this metadata is not present at the end of
the db.free file, I just ignore it (setting the hash and nextTxid to -1).

So, now in the checkpoint I only serialize the freeList (with the metadata) into a
ByteArrayOutputStream and write the file asynchronously (see storeFreeListAsync).

storeFreeListAsync makes sure the bytes represent the sequence set at checkpoint time,
but the write can be done later. In the recovery path, we have ways of knowing whether
db.free is up to date and was fully written last time.

One thing I found strange, though: when a full recovery is performed, the number of free
pages can differ from the original one.

For instance, in the test "testFreePageRecoveryUncleanShutdown", if we compare
pf2.getFreePageCount() and pf.getFreePageCount(), the numbers will not be the same.

 



[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657459#comment-16657459
 ] 

ASF GitHub Bot commented on AMQ-7080:
-

GitHub user alanprot opened a pull request:

https://github.com/apache/activemq/pull/313

AMQ-7080 - Keep track of free pages - Update db.free file during checkpoints

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alanprot/activemq AMQ-7080-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/activemq/pull/313.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #313


commit 1c6cb6014265482a3ed320d7440e5c180629b727
Author: Alan Protasio 
Date:   2018-10-19T20:12:45Z

AMQ-7080 - Keep track of free pages - Update db.free file during checkpoints






[jira] [Updated] (ARTEMIS-1592) Clustered broker throws "java.lang.IllegalStateException: Cannot find binding for [Queue]" for auto-deleted queues

2018-10-19 Thread Johan Stenberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Stenberg updated ARTEMIS-1592:

Affects Version/s: 2.6.3

> Clustered broker throws "java.lang.IllegalStateException: Cannot find binding 
> for [Queue]" for auto-deleted queues
> --
>
> Key: ARTEMIS-1592
> URL: https://issues.apache.org/jira/browse/ARTEMIS-1592
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: Broker
>Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.6.3
> Environment: Artemis 2.6.0, JDK8 64bit, Windows 7
>Reporter: Johan Stenberg
>Priority: Minor
> Attachments: ArtemisTest.java
>
>
> When a clustered, auto-created queue is auto-deleted (i.e. when the 
> consumer-count and message-count reach 0) this stack-trace can be logged:
> {noformat}
> ERROR: AMQ224037: cluster connection Failed to handle message
> java.lang.IllegalStateException: Cannot find binding for 
> queues.myQueueb4b3a157-f51c-11e7-b013-54524514640f
>   at 
> org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerClosed(ClusterConnectionImpl.java:1360)
>   at 
> org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.handleNotificationMessage(ClusterConnectionImpl.java:1046)
>   at 
> org.apache.activemq.artemis.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1016)
>   at 
> org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1003)
>   at 
> org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:50)
>   at 
> org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1126)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> There is no functional impact associated with this message.





[jira] [Updated] (ARTEMIS-2031) Message Filter not displayed in management console for ANYCAST Queues

2018-10-19 Thread Johan Stenberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Stenberg updated ARTEMIS-2031:

Affects Version/s: (was: 2.6.2)
   2.6.3

> Message Filter not displayed in management console for ANYCAST Queues
> -
>
> Key: ARTEMIS-2031
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2031
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: Broker, Web Console
>Affects Versions: 2.6.3
>Reporter: Johan Stenberg
>Priority: Minor
> Attachments: AdminUI_Queues.png
>
>
> The Admin UI only shows the message filter for multicast but not for unicast 
> addresses.
> To reproduce, create a new broker instance and try the following code:
> {code:java}
> import java.util.concurrent.TimeUnit;
> import javax.jms.*;
> import org.apache.qpid.jms.*;
> /*
>  <dependency>
>    <groupId>org.apache.qpid</groupId>
>    <artifactId>qpid-jms-client</artifactId>
>    <version>0.35.0</version>
>  </dependency>
>  */
> public class Test {
>     public static void main(final String[] args) throws Exception {
>         final JmsQueue queue = new JmsQueue("myQueueWithFilter");
>         final JmsTopic topic = new JmsTopic("myTopicWithFilter");
>         try ( //
>             final Connection conn = new JmsConnectionFactory("amqp://localhost").createConnection("user", "user"); //
>             final Session sess = conn.createSession(); //
>             final MessageConsumer queueConsumer = sess.createConsumer(queue, "type='FOO'"); //
>             final MessageConsumer topicConsumer = sess.createConsumer(topic, "type='FOO'"); //
>         ) {
>             conn.start();
>             Thread.sleep(TimeUnit.SECONDS.toMillis(60));
>         }
>     }
> }
> {code}
> The admin UI then shows both addresses, but it displays the message filter
> only for the MULTICAST queue.
>  
> !AdminUI_Queues.png!
>  





[jira] [Commented] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657095#comment-16657095
 ] 

Alan Protasio commented on AMQ-7082:


This is what I meant...

[https://github.com/alanprot/activemq/commit/9b96e590de8b180815bd85dc701e8d0a8f28b031]

:D

> KahaDB index, recover free pages in parallel with start
> ---
>
> Key: AMQ-7082
> URL: https://issues.apache.org/jira/browse/AMQ-7082
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.0
>Reporter: Gary Tully
>Assignee: Gary Tully
>Priority: Major
> Fix For: 5.16.0
>
>
> AMQ-6590 fixes free page loss through recovery. The recovery process can
> take a long time, which prevents fast failover; doing recovery on shutdown
> is preferable, but it is still not ideal because it holds onto the KahaDB
> lock. It can also stall shutdown unexpectedly.
> AMQ-7080 is going to tackle checkpointing the free list. This should help
> avoid the need for recovery, but recovery may still be necessary. If the
> perf hit is significant, this may need to be optional.
> There will still be the need to walk the index to find the free list.
> It is possible to run with no free list and grow, and we can do that while
> we recover the free list in parallel, then merge the two at a safe point.
> This we can do at startup.
> In cases where the disk is the bottleneck this won't help much, but it will
> help failover and it will help shutdown; with a bit of luck the recovery
> will complete before we stop.
>  
> Initially I thought this would be too complex, but if we concede some growth
> while we recover, i.e. start with an empty free list, it should be
> straightforward to merge with a recovered one.
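
A rough sketch of the idea described above, with invented names (recoveredFreeList,
mergeRecoveredPages and isFreePage are placeholders, not the real PageFile API): start
with an empty free list, scan the index on a background thread, and merge the recovered
pages at a safe point under the index lock.

{code:java}
// Sketch only: parallel free-page recovery merged at a safe point.
import java.util.TreeSet;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelFreePageRecoverySketch {

    private final ConcurrentLinkedQueue<Long> recoveredFreeList = new ConcurrentLinkedQueue<>();
    private final TreeSet<Long> freeList = new TreeSet<>();        // guarded by indexLock
    private final Object indexLock = new Object();
    private final ExecutorService recoveryExecutor = Executors.newSingleThreadExecutor();

    /** Kick off recovery at startup; normal work keeps allocating (and growing) meanwhile. */
    void start(long totalPageCount) {
        recoveryExecutor.submit(() -> {
            for (long page = 0; page < totalPageCount; page++) {
                if (isFreePage(page)) {
                    recoveredFreeList.add(page);
                }
            }
        });
    }

    /** Safe point (e.g. a flush): merge whatever the background scan has found so far. */
    void mergeRecoveredPages() {
        synchronized (indexLock) {
            Long page;
            while ((page = recoveredFreeList.poll()) != null) {
                freeList.add(page);
            }
        }
    }

    private boolean isFreePage(long page) {
        return false;   // placeholder: the real scan would read the page header
    }
}
{code}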





[jira] [Comment Edited] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656995#comment-16656995
 ] 

Alan Protasio edited comment on AMQ-7082 at 10/19/18 4:15 PM:
--

So, we can lock only if the recoveredFreeList is not empty AND the freeList is empty,
then... no? That is a good trade-off: we only take this hit during recovery AND when we
need more free pages. I imagine that allocating new pages will probably take more time
than this sync...

It would be amazing to have these free pages as soon as we know about them... -Maybe
with this, AMQ-7080 stops making sense at all...- (AMQ-7080 still has the potential to
avoid a lot of reads - on NFS we can have limited throughput - MB/s - so maybe it is
still worthwhile.)
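
A tiny, self-contained sketch of that trade-off (all names are hypothetical): the extra
coordination is paid only when the regular free list is empty and the recovery thread
has already produced pages; otherwise allocation behaves as it does today.

{code:java}
// Sketch only: guard the recovered-page lookup behind the "starved" condition.
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentLinkedQueue;

public class GuardedAllocationSketch {

    private final Deque<Long> freeList = new ArrayDeque<>();   // protected by the index lock in practice
    private final ConcurrentLinkedQueue<Long> recoveredFreeList = new ConcurrentLinkedQueue<>();
    private long nextNewPage;

    long allocatePage() {
        if (freeList.isEmpty() && !recoveredFreeList.isEmpty()) {
            Long recovered = recoveredFreeList.poll();   // only path that touches the concurrent queue
            if (recovered != null) {
                return recovered;
            }
        }
        if (!freeList.isEmpty()) {
            return freeList.pop();                       // normal path, no extra coordination
        }
        return nextNewPage++;                            // grow the page file, as happens today
    }
}
{code}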


was (Author: alanprot):
So, we can lock only if the recoveredFreeList is not empty AND the freeList is empty,
then... no? That is a good trade-off: we only take this hit during recovery AND when we
need more free pages. I imagine that allocating new pages will probably take more time
than this sync...

It would be amazing to have these free pages as soon as we know about them... Maybe with
this, AMQ-7080 stops making sense at all...



[jira] [Comment Edited] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656995#comment-16656995
 ] 

Alan Protasio edited comment on AMQ-7082 at 10/19/18 4:09 PM:
--

So, we can lock only if the recoveredFreeList is not empty AND the freeList is empty,
then... no? That is a good trade-off: we only take this hit during recovery AND when we
need more free pages. I imagine that allocating new pages will probably take more time
than this sync...

It would be amazing to have these free pages as soon as we know about them... Maybe with
this, AMQ-7080 stops making sense at all...


was (Author: alanprot):
So, we can lock only if the recoveredFreeList is not empty AND the freeList is empty,
then... no? That is a good trade-off: we only take this hit during recovery. I imagine
that allocating new pages will probably take more time than this sync...

It would be amazing to have these free pages as soon as we know about them... Maybe with
this, AMQ-7080 stops making sense at all...



[jira] [Comment Edited] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656995#comment-16656995
 ] 

Alan Protasio edited comment on AMQ-7082 at 10/19/18 4:09 PM:
--

So, we can lock only if the recoveredFreeList is not empty AND the freeList is empty,
then... no? That is a good trade-off: we only take this hit during recovery. I imagine
that allocating new pages will probably take more time than this sync...

It would be amazing to have these free pages as soon as we know about them... Maybe with
this, AMQ-7080 stops making sense at all...


was (Author: alanprot):
So, we can lock only if the recoveredFreeList is not empty, then... no? That is a good
trade-off: we only take this hit during recovery.

It would be amazing to have these free pages as soon as we know about them... Maybe with
this, AMQ-7080 stops making sense at all...



[jira] [Commented] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Christopher L. Shannon (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656994#comment-16656994
 ] 

Christopher L. Shannon commented on AMQ-7082:
-

Nice work [~gtully], this should work nicely to prevent the shutdown issue. 
When I'm back at the office Monday I will take a look at all of this in more 
depth and do some testing (I've been traveling for work all week)



[jira] [Commented] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656995#comment-16656995
 ] 

Alan Protasio commented on AMQ-7082:


So, we can lock only if the recoveredFreeList is not empty, then... no? That is a good
trade-off: we only take this hit during recovery.

It would be amazing to have these free pages as soon as we know about them... Maybe with
this, AMQ-7080 stops making sense at all...



[jira] [Commented] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Gary Tully (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656986#comment-16656986
 ] 

Gary Tully commented on AMQ-7082:
-

I did not want to block normal work with sync on what the recovery thread is 
doing. I think it will be constant churn there.

Typically all operations on the pageFile hold the index lock, so it can be 
mostly sync free (the exception being the optional async writer thread).

It may be worth a test, though, to validate :)

 



[jira] [Commented] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656967#comment-16656967
 ] 

ASF subversion and git services commented on AMQ-7082:
--

Commit efa4e683bc9f4fc458ce80125304f20a742a7907 in activemq's branch 
refs/heads/master from gtully
[ https://git-wip-us.apache.org/repos/asf?p=activemq.git;h=efa4e68 ]

AMQ-7082 - fix final ref in test




[jira] [Commented] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656969#comment-16656969
 ] 

Alan Protasio commented on AMQ-7082:


This is amazing!! :D:D:D:D:D:D:D:D

Maybe we can do a small change to synchronize the newFreePages and use it as soon as it
has recovered values... no?

 

Something like this:

https://github.com/alanprot/activemq/commit/a9354e330e41cf4bf7e6fa21b57490a3ef22609b



[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread Alan Protasio (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656923#comment-16656923
 ] 

Alan Protasio commented on AMQ-7080:


I'm calculating a hash and saving it in the metadata... (for now the hash is simply an
XOR operation; maybe that is enough here...)

[https://github.com/alanprot/activemq/commit/de3b1ad9927ed20449c10afa687056322869ce00]

I also did some optimization in the Marshaller to do fewer writes... this sped the
operation up a lot, decreasing the performance hit. This is still a WIP: no tests yet,
because I'm still doing the optimization and measuring the performance hit. I will share
the results when I have more concrete data...

I also created one more field in the index metadata (needsFreePageRecovery) to keep
track of whether recovery is needed. This is needed because the recovery is now done on
shutdown and we save db.free at the checkpoint, so I can have a db.free that does not
represent all the free pages.

Imagine:

1 -> Free page count is 1000

2 -> Unclean shutdown

3 -> db.free and db.data are out of sync

4 -> ActiveMQ starts and allocates new pages (it cannot reuse db.free)

5 -> ActiveMQ allocates 200 more pages

6 -> Unclean shutdown (db.free has 200 pages)

7 -> ActiveMQ starts and db.free and db.data ARE in sync (I can reuse the 200 free pages
from the last run)

8 -> Clean shutdown -> I should still try to recover the free pages and rebuild the 1200
free pages

We could recover on every unclean shutdown... but in most cases that will not be needed.
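
For illustration (the class and method names are invented, and the real field semantics
may differ), a sketch of the simple XOR-style checksum mentioned above and of the
needsFreePageRecovery flag, which stays raised across restarts until a full scan has
rebuilt the free list:

{code:java}
// Sketch only: XOR checksum over the serialized free list, plus the recovery-flag idea.
public class FreeListChecksumSketch {

    /** XOR all bytes together; cheap, and enough to spot a truncated write. */
    static long xorChecksum(byte[] serializedFreeList) {
        long hash = 0;
        for (byte b : serializedFreeList) {
            hash ^= (b & 0xFFL);
        }
        return hash;
    }

    /** Hypothetical needsFreePageRecovery bookkeeping. */
    static class RecoveryFlag {
        private boolean needsFreePageRecovery;

        void onStart(boolean cleanShutdown, boolean dbFreeInSync) {
            if (!cleanShutdown && !dbFreeInSync) {
                needsFreePageRecovery = true;    // free pages may have been lost in that run
            }
        }

        void onFullRecoveryCompleted() {
            needsFreePageRecovery = false;       // every free page is accounted for again
        }

        boolean isRecoveryNeeded() {
            return needsFreePageRecovery;
        }
    }
}
{code}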

 

 



[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread Gary Tully (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656919#comment-16656919
 ] 

Gary Tully commented on AMQ-7080:
-

with https://issues.apache.org/jira/browse/AMQ-7082 the need for any change 
around ACTIVEMQ_KILL_MAXSECONDS goes away I think.



[jira] [Resolved] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Gary Tully (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Tully resolved AMQ-7082.
-
Resolution: Fixed

The recovery is now back in the start phase; if it completes, we are good. Otherwise we
try again the next time.

 

[~jgoodyear] - this is a take on the parallel approach; I think it makes good sense.

[~alanprot] - there is still a good case for checkpointing, which will reduce the full
replay window, but the perf impact will be the key determinant on that, I think.

It would be good to gauge the impact of the second reader in your case over NFS; there
may be a need to slow down the recovery thread so that it does not hog the disk or CPU.
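
One hypothetical way to do that (the batch size and pause below are made-up tuning
knobs, not existing options) is to sleep briefly every N pages so the background scan
yields to regular broker I/O:

{code:java}
// Sketch only: throttle the background free-page scan.
public class ThrottledRecoveryScanSketch {

    private static final int PAGES_PER_BATCH = 1_000;
    private static final long PAUSE_MILLIS = 10;

    void scan(long totalPageCount) throws InterruptedException {
        int scannedInBatch = 0;
        for (long page = 0; page < totalPageCount; page++) {
            examinePage(page);                       // placeholder for reading one page header
            if (++scannedInBatch >= PAGES_PER_BATCH) {
                scannedInBatch = 0;
                Thread.sleep(PAUSE_MILLIS);          // yield so regular broker I/O gets through
            }
        }
    }

    private void examinePage(long page) {
        // placeholder: the real scan would check whether this page is free
    }
}
{code}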

 



[jira] [Commented] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656899#comment-16656899
 ] 

ASF subversion and git services commented on AMQ-7082:
--

Commit 79c74998dc1efb72b05d32f920052a1df4b6dd8e in activemq's branch 
refs/heads/master from gtully
[ https://git-wip-us.apache.org/repos/asf?p=activemq.git;h=79c7499 ]

AMQ-7082 - recover index free pages in parallel with start, merge in flush, 
clean shutdown if complete. follow up on AMQ-6590


> KahaDB index, recover free pages in parallel with start
> ---
>
> Key: AMQ-7082
> URL: https://issues.apache.org/jira/browse/AMQ-7082
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.0
>Reporter: Gary Tully
>Assignee: Gary Tully
>Priority: Major
> Fix For: 5.16.0
>
>
> AMQ-6590 fixes free page loss through recovery. The recovery process can 
> take a long time, which prevents fast failover. Doing recovery on shutdown is 
> preferable, but it is still not ideal b/c it will hold onto the kahadb lock. 
> It also can stall shutdown unexpectedly.
> AMQ-7080 is going to tackle checkpointing the free list. This should help 
> avoid the need for recovery but it may still be necessary. If the perf hit is 
> significant this may need to be optional.
> There will still be the need to walk the index to find the free list.
> It is possible to run with no free list and grow, and we can do that while we 
> recover the free list in parallel, then merge the two at a safe point. This 
> we can do at startup.
> In cases where the disk is the bottleneck this won't help much, but it will 
> help failover and it will help shutdown, with a bit of luck the recovery will 
> complete before we stop.
>  
> Initially I thought this would be too complex, but if we concede some growth 
> while we recover, i.e. start with an empty free list, it should be 
> straightforward to merge with a recovered one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-6590) KahaDB index loses track of free pages on unclean shutdown

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656900#comment-16656900
 ] 

ASF subversion and git services commented on AMQ-6590:
--

Commit 79c74998dc1efb72b05d32f920052a1df4b6dd8e in activemq's branch 
refs/heads/master from gtully
[ https://git-wip-us.apache.org/repos/asf?p=activemq.git;h=79c7499 ]

AMQ-7082 - recover index free pages in parallel with start, merge in flush, 
clean shutdown if complete. follow up on AMQ-6590


> KahaDB index loses track of free pages on unclean shutdown
> --
>
> Key: AMQ-6590
> URL: https://issues.apache.org/jira/browse/AMQ-6590
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Affects Versions: 5.14.3
>Reporter: Christopher L. Shannon
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.15.0, 5.14.4, 5.16.0, 5.15.7
>
>
> I have discovered an issue with the KahaDB index recovery after an unclean 
> shutdown (OOM error, kill -9, etc) that leads to excessive disk space usage. 
> Normally on clean shutdown the index stores the known set of free pages to 
> db.free and reads that in on start up to know which pages can be re-used.  On 
> an unclean shutdown this is not written to disk so on start up the index is 
> supposed to scan the page file to figure out all of the free pages.
> Unfortunately it turns out that this scan of the page file is being done 
> before the total page count value has been set so when the iterator is 
> created it always thinks there are 0 pages to scan.
> The end result is that every time an unclean shutdown occurs all known free 
> pages are lost and no longer tracked.  This of course means new free pages 
> have to be allocated and all of the existing space is now lost which will 
> lead to excessive index file growth over time.
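
To make the ordering problem concrete, here is a minimal stand-alone sketch (the 
field and method names are hypothetical, not the real PageFile internals): if the 
scan runs before the total page count has been read from the metadata, the iterator 
visits zero pages and every previously free page is lost.

{code:java}
public class FreePageScanOrderSketch {
    static long totalPageCount = 0; // not yet read from the index metadata

    static long pagesVisited() {
        long visited = 0;
        for (long page = 0; page < totalPageCount; page++) {
            visited++; // in the real scan, each page header is inspected here
        }
        return visited;
    }

    public static void main(String[] args) {
        // Scan before the metadata is loaded: the iterator sees nothing.
        System.out.println("pages visited before metadata load: " + pagesVisited()); // 0

        // The fix: load the page count first, then scan.
        totalPageCount = 5000;
        System.out.println("pages visited after metadata load:  " + pagesVisited()); // 5000
    }
}
{code}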



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARTEMIS-2140) AMQP Shared Subscriptions fail because of RaceCondition

2018-10-19 Thread Johan Stenberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Stenberg updated ARTEMIS-2140:

Attachment: Artemis2140_AmqpSharedConsumerRaceConditionTest.java

> AMQP Shared Subscriptions fail because of RaceCondition
> ---
>
> Key: ARTEMIS-2140
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2140
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: AMQP, Broker
>Affects Versions: 2.6.3
>Reporter: Johan Stenberg
>Priority: Major
> Attachments: Artemis2140_AmqpSharedConsumerRaceConditionTest.java
>
>
> When multiple clients try to subscribe to the same shared subscription the 
> following exception occurs:
> {noformat}
> Okt 19, 2018 4:25:21 PM 
> org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler 
> dispatch
> WARN: AMQ119018: Binding already exists LocalQueueBinding 
> [address=topics.cat, queue=QueueImpl[name=nonDurable.MY_SUB, 
> postOffice=PostOfficeImpl 
> [server=ActiveMQServerImpl::serverUUID=cf020fcf-d3aa-11e8-ac61-54524514640f], 
> temp=false]@6425eba3, filter=null, name=nonDurable.MY_SUB, 
> clusterName=nonDurable.MY_SUBcf020fcf-d3aa-11e8-ac61-54524514640f]
> ActiveMQQueueExistsException[errorType=QUEUE_EXISTS message=AMQ119018: 
> Binding already exists LocalQueueBinding [address=topics.cat, 
> queue=QueueImpl[name=nonDurable.MY_SUB, postOffice=PostOfficeImpl 
> [server=ActiveMQServerImpl::serverUUID=cf020fcf-d3aa-11e8-ac61-54524514640f], 
> temp=false]@6425eba3, filter=null, name=nonDurable.MY_SUB, 
> clusterName=nonDurable.MY_SUBcf020fcf-d3aa-11e8-ac61-54524514640f]]
>   at 
> org.apache.activemq.artemis.core.postoffice.impl.SimpleAddressManager.addBinding(SimpleAddressManager.java:85)
>   at 
> org.apache.activemq.artemis.core.postoffice.impl.WildcardAddressManager.addBinding(WildcardAddressManager.java:90)
>   at 
> org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.addBinding(PostOfficeImpl.java:615)
>   at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:2818)
>   at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:1690)
>   at 
> org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:594)
>   at 
> org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:634)
>   at 
> org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback.createSharedVolatileQueue(AMQPSessionCallback.java:283)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerSenderContext.initialise(ProtonServerSenderContext.java:374)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.AMQPSessionContext.addSender(AMQPSessionContext.java:168)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.remoteLinkOpened(AMQPConnectionContext.java:243)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.onRemoteOpen(AMQPConnectionContext.java:462)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.handler.Events.dispatch(Events.java:68)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.dispatch(ProtonHandler.java:494)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.flush(ProtonHandler.java:307)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.inputBuffer(ProtonHandler.java:272)
>   at 
> org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.inputBuffer(AMQPConnectionContext.java:158)
>   at 
> org.apache.activemq.artemis.protocol.amqp.broker.ActiveMQProtonRemotingConnection.bufferReceived(ActiveMQProtonRemotingConnection.java:147)
>   at 
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$DelegatingBufferHandler.bufferReceived(RemotingServiceImpl.java:643)
>   at 
> org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.channelRead(ActiveMQChannelHandler.java:73)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
>   at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
>   at 
> 

[jira] [Created] (ARTEMIS-2140) AMQP Shared Subscriptions fail because of RaceCondition

2018-10-19 Thread Johan Stenberg (JIRA)
Johan Stenberg created ARTEMIS-2140:
---

 Summary: AMQP Shared Subscriptions fail because of RaceCondition
 Key: ARTEMIS-2140
 URL: https://issues.apache.org/jira/browse/ARTEMIS-2140
 Project: ActiveMQ Artemis
  Issue Type: Bug
  Components: AMQP, Broker
Affects Versions: 2.6.3
Reporter: Johan Stenberg


When multiple clients try to subscribe to the same shared subscription the 
following exception occurs:
{noformat}
Okt 19, 2018 4:25:21 PM 
org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler dispatch
WARN: AMQ119018: Binding already exists LocalQueueBinding [address=topics.cat, 
queue=QueueImpl[name=nonDurable.MY_SUB, postOffice=PostOfficeImpl 
[server=ActiveMQServerImpl::serverUUID=cf020fcf-d3aa-11e8-ac61-54524514640f], 
temp=false]@6425eba3, filter=null, name=nonDurable.MY_SUB, 
clusterName=nonDurable.MY_SUBcf020fcf-d3aa-11e8-ac61-54524514640f]
ActiveMQQueueExistsException[errorType=QUEUE_EXISTS message=AMQ119018: Binding 
already exists LocalQueueBinding [address=topics.cat, 
queue=QueueImpl[name=nonDurable.MY_SUB, postOffice=PostOfficeImpl 
[server=ActiveMQServerImpl::serverUUID=cf020fcf-d3aa-11e8-ac61-54524514640f], 
temp=false]@6425eba3, filter=null, name=nonDurable.MY_SUB, 
clusterName=nonDurable.MY_SUBcf020fcf-d3aa-11e8-ac61-54524514640f]]
at 
org.apache.activemq.artemis.core.postoffice.impl.SimpleAddressManager.addBinding(SimpleAddressManager.java:85)
at 
org.apache.activemq.artemis.core.postoffice.impl.WildcardAddressManager.addBinding(WildcardAddressManager.java:90)
at 
org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.addBinding(PostOfficeImpl.java:615)
at 
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:2818)
at 
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:1690)
at 
org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:594)
at 
org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:634)
at 
org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback.createSharedVolatileQueue(AMQPSessionCallback.java:283)
at 
org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerSenderContext.initialise(ProtonServerSenderContext.java:374)
at 
org.apache.activemq.artemis.protocol.amqp.proton.AMQPSessionContext.addSender(AMQPSessionContext.java:168)
at 
org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.remoteLinkOpened(AMQPConnectionContext.java:243)
at 
org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.onRemoteOpen(AMQPConnectionContext.java:462)
at 
org.apache.activemq.artemis.protocol.amqp.proton.handler.Events.dispatch(Events.java:68)
at 
org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.dispatch(ProtonHandler.java:494)
at 
org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.flush(ProtonHandler.java:307)
at 
org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.inputBuffer(ProtonHandler.java:272)
at 
org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.inputBuffer(AMQPConnectionContext.java:158)
at 
org.apache.activemq.artemis.protocol.amqp.broker.ActiveMQProtonRemotingConnection.bufferReceived(ActiveMQProtonRemotingConnection.java:147)
at 
org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$DelegatingBufferHandler.bufferReceived(RemotingServiceImpl.java:643)
at 
org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.channelRead(ActiveMQChannelHandler.java:73)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
 

[jira] [Commented] (ARTEMIS-2096) AMQP: Refactoring AMQPMessage abstraction for better consistency and performance

2018-10-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656843#comment-16656843
 ] 

ASF GitHub Bot commented on ARTEMIS-2096:
-

Github user gaohoward commented on the issue:

https://github.com/apache/activemq-artemis/pull/2383
  
agreed I'll remove the NO-JIRA. thx.


> AMQP: Refactoring AMQPMessage abstraction for better consistency and 
> performance
> 
>
> Key: ARTEMIS-2096
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2096
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: AMQP
>Affects Versions: 2.6.3
>Reporter: Timothy Bish
>Assignee: Timothy Bish
>Priority: Major
> Fix For: 2.7.0
>
>
> The AMQPMessage abstraction used to wrap the AMQP message section has some 
> inconsistencies in how it manages the underlying data and the decoded AMQP 
> section obtained from the Proton-J codec, as well as issues with state being 
> maintained in the presence of changes to the message made through the public 
> facing Message APIs.
> A refactoring of the AMQPMessage class to better utilize the proton-j codec 
> to manage the message data and how it is parsed and re-encoded on change 
> needs to be done to ensure no corrupt messages are sent and that we are not 
> decoding and encoding sections of the message we are not intending to read or 
> change on the server (we currently can decode message bodies or footers in a 
> few cases where we intend not to).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARTEMIS-2096) AMQP: Refactoring AMQPMessage abstraction for better consistency and performance

2018-10-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656838#comment-16656838
 ] 

ASF GitHub Bot commented on ARTEMIS-2096:
-

Github user tabish121 commented on the issue:

https://github.com/apache/activemq-artemis/pull/2383
  
This isn't really a NO-JIRA commit if you are adding a test against a 
specific JIRA is it?  No reason I can see to put a NO-JIRA on this instead of 
just tagging against the JIRA that it is related to.  


> AMQP: Refactoring AMQPMessage abstraction for better consistency and 
> performance
> 
>
> Key: ARTEMIS-2096
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2096
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: AMQP
>Affects Versions: 2.6.3
>Reporter: Timothy Bish
>Assignee: Timothy Bish
>Priority: Major
> Fix For: 2.7.0
>
>
> The AMQPMessage abstraction used to wrap the AMQP message section has some 
> inconsistencies in how it manages the underlying data and the decoded AMQP 
> section obtained from the Proton-J codec, as well as issues with state being 
> maintained in the presence of changes to the message made through the public 
> facing Message APIs.
> A refactoring of the AMQPMessage class to better utilize the proton-j codec 
> to manage the message data and how it is parsed and re-encoded on change 
> needs to be done to ensure no corrupt messages are sent and that we are not 
> decoding and encoding sections of the message we are not intending to read or 
> change on the server (we currently can decode message bodies or footers in a 
> few cases where we intend not to).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Gary Tully (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Tully closed AMQ-7081.
---

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
> Attachments: AMQ7081Test.java
>
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Gary Tully (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Tully resolved AMQ-7081.
-
Resolution: Not A Bug

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
> Attachments: AMQ7081Test.java
>
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARTEMIS-2096) AMQP: Refactoring AMQPMessage abstraction for better consistency and performance

2018-10-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656824#comment-16656824
 ] 

ASF GitHub Bot commented on ARTEMIS-2096:
-

GitHub user gaohoward opened a pull request:

https://github.com/apache/activemq-artemis/pull/2383

NO-JIRA Adding a test for ARTEMIS-2096

This test can verify an issue fixed by the commit:
7a463f038ae324f2c5c908321b2ebf03b5a8e303 (ARTEMIS-2096)
The issue was reported in:
https://issues.jboss.org/browse/ENTMQBR-2034
but not reported in Artemis.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gaohoward/activemq-artemis c_2034

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/activemq-artemis/pull/2383.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2383


commit 89ef072d80d6d034be73244af05617b2bf3baf36
Author: Howard Gao 
Date:   2018-10-19T13:40:33Z

NO-JIRA Adding a test for ARTEMIS-2096

This test can verify an issue fixed by the commit:
7a463f038ae324f2c5c908321b2ebf03b5a8e303 (ARTEMIS-2096)
The issue was reported in:
https://issues.jboss.org/browse/ENTMQBR-2034
but not reported in Artemis.




> AMQP: Refactoring AMQPMessage abstraction for better consistency and 
> performance
> 
>
> Key: ARTEMIS-2096
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2096
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: AMQP
>Affects Versions: 2.6.3
>Reporter: Timothy Bish
>Assignee: Timothy Bish
>Priority: Major
> Fix For: 2.7.0
>
>
> The AMQPMessage abstraction used to wrap the AMQP message section has some 
> inconsistencies in how it manages the underlying data and the decoded AMQP 
> section obtained from the Proton-J codec, as well as issues with state being 
> maintained in the presence of changes to the message made through the public 
> facing Message APIs.
> A refactoring of the AMQPMessage class to better utilize the proton-j codec 
> to manage the message data and how it is parsed and re-encoded on change 
> needs to be done to ensure no corrupt messages are sent and that we are not 
> decoding and encoding sections of the message we are not intending to read or 
> change on the server (we currently can decode message bodies or footers in a 
> few cases where we intend not to).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-7068) Advisory messages are empty when received with a AMQP subscription

2018-10-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656822#comment-16656822
 ] 

ASF GitHub Bot commented on AMQ-7068:
-

GitHub user JohBa opened a pull request:

https://github.com/apache/activemq/pull/312

[AMQ-7068] Advisory messages are empty when received with a AMQP 
subscription

This pull request does not tackle the whole problem, but part of it.
It maps `ConnectionInfo` and parts of `RemoveInfo` advisory messages for usage 
with AMQP.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohBa/activemq master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/activemq/pull/312.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #312


commit 8c7e320dca569bdcf2157d9b248d8223f2f6a750
Author: Johannes Bäurle 
Date:   2018-10-18T12:27:07Z

Partly mapped ConnectionInfo advisory message for AMQP

commit ab36bd4ab514a72053ad695cfe2082a097c94f15
Author: Johannes Bäurle 
Date:   2018-10-18T12:31:16Z

Merge remote-tracking branch 'upstream/master'

commit 182fbab114a53d4ae2b9c7a6436e7a67794e4795
Author: Johannes Bäurle 
Date:   2018-10-18T12:33:14Z

Mapping for removeInfo with connectionId for AMQP advisory message

commit ef34fb1c5872e2b8dc4ecc19e87e9737cd74c331
Author: Johannes Bäurle 
Date:   2018-10-18T12:36:26Z

Add Tests

commit 4ab7fd0aac52ca3e2f37ecc4bd7487eb93120ec5
Author: Johannes Bäurle 
Date:   2018-10-19T13:38:53Z

adjust tests




> Advisory messages are empty when received with a AMQP subscription
> --
>
> Key: AMQ-7068
> URL: https://issues.apache.org/jira/browse/AMQ-7068
> Project: ActiveMQ
>  Issue Type: New Feature
>  Components: AMQP, Transport
>Affects Versions: 5.15.6
> Environment: ActiveMQ 5.15.6, JDK 1.8.0_161, Windows 10, AMQPNetLite 
> 2.1.4
>Reporter: Johannes Baeurle
>Priority: Minor
> Attachments: issueamq.PNG, issueamq2.PNG, issueamq3.PNG
>
>
> We are currently moving from OpenWire to AMQP, using the .NET library amqpnetlite 
> ([https://github.com/Azure/amqpnetlite]) to communicate. So far most things 
> work fine, but we actively use and need the advisory messages, for example in 
> order to recognize when other clients disconnect.
> Accessing the advisory messages through the topic is not the problem, but the 
> body is null for the ActiveMQ.Advisory.Connection topic. There are some 
> properties set, but no body set and I'm not able to find any important 
> information, like the RemoveInfo. I attached a few screenshots from debugger.
> To be honest, I don't know if this is the desired behavior, but I think if 
> there are messages on the advisory topic they should be useful.
> I know that a byte representation wouldn't be that useful, but you could 
> present the information in json or xml format, like in Stomp? 
> (https://issues.apache.org/jira/browse/AMQ-2098)
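
A hedged sketch of a Java consumer for the advisory topic over AMQP using the Qpid 
JMS client (broker URL, credentials and timeout are placeholders, not taken from the 
report): it just prints whatever properties and body arrive, which is how the empty 
body described above shows up.

{code:java}
import java.util.Enumeration;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;
import org.apache.qpid.jms.JmsConnectionFactory;

public class AdvisoryConsumerSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new JmsConnectionFactory("amqp://localhost:5672");
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic advisory = session.createTopic("ActiveMQ.Advisory.Connection");
            MessageConsumer consumer = session.createConsumer(advisory);

            Message message = consumer.receive(30_000);
            if (message != null) {
                // Dump the message properties that the broker did set.
                Enumeration<?> names = message.getPropertyNames();
                while (names.hasMoreElements()) {
                    String name = (String) names.nextElement();
                    System.out.println(name + " = " + message.getObjectProperty(name));
                }
                // The body is null/empty for ActiveMQ.Advisory.Connection, per the report.
                String body = (message instanceof TextMessage)
                        ? ((TextMessage) message).getText() : String.valueOf(message);
                System.out.println("body: " + body);
            }
        } finally {
            connection.close();
        }
    }
}
{code}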



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Simon Lundstrom (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656810#comment-16656810
 ] 

Simon Lundstrom commented on AMQ-7081:
--

I was fighting with configuring the log level when running the tests for a while, 
but I finally got it running and the test passes.

Yes, it works as intended and I was wrong, but I can't figure out how I could 
get it to work before. Oh well! We're looking forward to upgrading to the next 
release and being able to alert on dead/hung slowConsumers now.

Thanks again! Have a great weekend!

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
> Attachments: AMQ7081Test.java
>
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-19 Thread Gary Tully (JIRA)
Gary Tully created AMQ-7082:
---

 Summary: KahaDB index, recover free pages in parallel with start
 Key: AMQ-7082
 URL: https://issues.apache.org/jira/browse/AMQ-7082
 Project: ActiveMQ
  Issue Type: Bug
  Components: KahaDB
Affects Versions: 5.15.0
Reporter: Gary Tully
Assignee: Gary Tully
 Fix For: 5.16.0


AMQ-6590 fixes free page loss through recovery. The recovery process can take 
a long time, which prevents fast failover. Doing recovery on shutdown is preferable, 
but it is still not ideal b/c it will hold onto the kahadb lock. It also can 
stall shutdown unexpectedly.

AMQ-7080 is going to tackle checkpointing the free list. This should help avoid 
the need for recovery but it may still be necessary. If the perf hit is 
significant this may need to be optional.

There will still be the need to walk the index to find the free list.

It is possible to run with no free list and grow, and we can do that while we 
recover the free list in parallel, then merge the two at a safe point. This we 
can do at startup.

In cases where the disk is the bottleneck this won't help much, but it will 
help failover and it will help shutdown, with a bit of luck the recovery will 
complete before we stop.

 

Initially I thought this would be too complex, but if we concede some growth 
while we recover, i.e. start with an empty free list, it should be straightforward 
to merge with a recovered one.
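
A very rough sketch of that "start empty, merge later" idea, using plain Java 
collections as a stand-in for the real KahaDB structures (the page ids and the 
"allocated since boot" bookkeeping are assumptions): the broker starts with an 
empty free list and simply grows the file, a background scan rebuilds the real 
free list, and at a safe point the two are merged minus anything allocated in 
the meantime.

{code:java}
import java.util.TreeSet;

public class FreeListMergeSketch {
    public static void main(String[] args) {
        TreeSet<Long> inUseFreeList = new TreeSet<>();       // starts empty at boot
        TreeSet<Long> allocatedSinceBoot = new TreeSet<>();  // pages handed out while recovering
        allocatedSinceBoot.add(7L);

        // Produced by the background scan that runs in parallel with the start.
        TreeSet<Long> recoveredFreeList = new TreeSet<>();
        recoveredFreeList.add(3L);
        recoveredFreeList.add(7L);
        recoveredFreeList.add(9L);

        // Merge at a safe point: pages that were re-allocated while we were
        // recovering must not be treated as free.
        recoveredFreeList.removeAll(allocatedSinceBoot);
        inUseFreeList.addAll(recoveredFreeList);

        System.out.println("free list after merge: " + inUseFreeList); // [3, 9]
    }
}
{code}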



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread Jeff Genender (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656744#comment-16656744
 ] 

Jeff Genender commented on AMQ-7080:


[~gtully] Yes it is fundamental so that definitely makes sense.  Can you please 
give your thoughts on my comment above about the ACTIVEMQ_KILL_MAXSECONDS?  
Thanks.

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Priority: Major
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ reads 
> the whole index during shutdown searching for free pages and then saves the 
> db.free file. This operation can take a long time, making the failover 
> slower (during the shutdown, ActiveMQ will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> It is important to note that if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS 
> seconds, any following shutdown will be unclean. The broker will stay in 
> this state unless the index is deleted (this state means that every failover 
> will take more than ACTIVEMQ_KILL_MAXSECONDS, so if you increase this time 
> to 5 minutes, your failover can take more than 5 minutes).
>  
> In order to prevent ActiveMQ from reading the whole index file to search for free 
> pages, we can keep track of those on every checkpoint. In order to do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have an attribute in the db.free file that is referenced by the db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> After a crash, we can check if the db.data has the same freePageUniqueId as the 
> db.free. If this is the case we can safely use the free page information 
> contained in the db.free.
> Now, the only way to read the whole index file again is IF the crash happens 
> between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, which can possibly increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in db.free 
> as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario, after the Pagefile#load, P1 will be free and then 
> the replay will mark P1 as occupied or will occupy another page (now that 
> the recovery of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and reflect 
> the reality at T1 (checkpoint). If they are in sync we can trust the db.free.
> This is a really quick draft of what I'm suggesting... If you guys agree, I 
> can create the proper patch after:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590
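
A minimal sketch of steps 1 and 2 above (the file names, layout and the use of a 
timestamp as the unique id are assumptions, not the real KahaDB encoding): db.free 
is written tagged with a freePageUniqueId, and the same id is then recorded in the 
index metadata, so on restart the two ids can be compared before the free list is 
trusted.

{code:java}
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CheckpointFreeListSketch {
    public static void main(String[] args) throws IOException {
        long freePageUniqueId = System.nanoTime(); // stand-in for a real unique id
        long[] freePages = {2L, 3L};

        // Step 1: persist the free list together with its unique id.
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream("db.free"))) {
            out.writeLong(freePageUniqueId);
            out.writeInt(freePages.length);
            for (long page : freePages) {
                out.writeLong(page);
            }
        }

        // Step 2: record the same id in the index metadata (a stand-in file here).
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream("db.data.meta"))) {
            out.writeLong(freePageUniqueId);
        }
    }
}
{code}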



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Simon Lundstrom (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656704#comment-16656704
 ] 

Simon Lundstrom commented on AMQ-7081:
--

Testing it out, hold on.

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
> Attachments: AMQ7081Test.java
>
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Gary Tully (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656669#comment-16656669
 ] 

Gary Tully edited comment on AMQ-7081 at 10/19/18 11:17 AM:


see the attached test case, leave it running for 30 seconds and note the TRACE 
logging.

 

https://issues.apache.org/jira/secure/attachment/12944710/AMQ7081Test.java


was (Author: gtully):
see the attached test case, leave it running for 30 seconds and note the TRACE 
logging.

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
> Attachments: AMQ7081Test.java
>
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Gary Tully (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656669#comment-16656669
 ] 

Gary Tully commented on AMQ-7081:
-

see the attached test case, leave it running for 30 seconds and note the TRACE 
logging.

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
> Attachments: AMQ7081Test.java
>
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Gary Tully (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Tully updated AMQ-7081:

Attachment: AMQ7081Test.java

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
> Attachments: AMQ7081Test.java
>
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Gary Tully (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656663#comment-16656663
 ] 

Gary Tully commented on AMQ-7081:
-

This looks ok: maxSlowDuration defaults to 30 seconds, so if it is slow for that 
long it gets kicked with that config.
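
For reference, the relevant policy entry looks roughly like this (element and 
attribute names as I understand them; the opening tags in the snippets quoted 
below were lost in the mail conversion). With both maxSlowCount="-1" and 
maxSlowDuration="-1" the strategy only detects and logs slow consumers without 
aborting them:

{code:xml}
<policyEntry queue=">">
  <slowConsumerStrategy>
    <abortSlowAckConsumerStrategy abortConnection="false"
                                  maxSlowCount="-1"
                                  maxSlowDuration="-1"/>
  </slowConsumerStrategy>
</policyEntry>
{code}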

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
>
> The fix of AMQ-7079 introduced a breaking-change bug: the default value 
> of {{maxSlowCount=-1}} was no longer enough for {{abortSlowAckConsumerStrategy}} 
> to just configure the slow consumer detection; it started to disconnect 
> the consumer as well.
> Setting {{maxSlowDuration="-1"}} doesn't disconnect the consumer though but I 
> don't think we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> worked before in just detecting a slow consumer. consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
>  maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARTEMIS-2139) Message sent to JMSReplyTo from old client does not find correct bindings

2018-10-19 Thread Martyn Taylor (JIRA)
Martyn Taylor created ARTEMIS-2139:
--

 Summary: Message sent to JMSReplyTo from old client does not find 
correct bindings
 Key: ARTEMIS-2139
 URL: https://issues.apache.org/jira/browse/ARTEMIS-2139
 Project: ActiveMQ Artemis
  Issue Type: Bug
Reporter: Martyn Taylor


The JMSReplyTo destination set by an older client contains an incorrect address, 
which means the reply message does not find a matching binding and the message is 
lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (ARTEMIS-2139) Message sent to JMSReplyTo from old client does not find correct bindings

2018-10-19 Thread Martyn Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martyn Taylor reassigned ARTEMIS-2139:
--

Assignee: Martyn Taylor

> Message sent to JMSReplyTo from old client does not find correct bindings
> -
>
> Key: ARTEMIS-2139
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2139
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>Reporter: Martyn Taylor
>Assignee: Martyn Taylor
>Priority: Major
>
> A JMSReplyTo destination set by an older client contains an incorrect 
> address, which causes the reply message to miss the correct binding, so the 
> message is lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Gary Tully (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Tully reassigned AMQ-7081:
---

Assignee: Gary Tully

> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: Broker
>Reporter: Simon Lundstrom
>Assignee: Gary Tully
>Priority: Major
>
> The fix for AMQ-7079 introduced a breaking change: the default value of 
> {{maxSlowCount=-1}} used to be enough for {{abortSlowAckConsumerStrategy}} 
> to just configure slow consumer detection, but now it disconnects the 
> consumer as well.
> Setting {{maxSlowDuration="-1"}} does avoid the disconnect, but I don't think 
> we should change the old default behavior.
> Pre AMQ-7079 fix:
> {code:xml}
> <abortSlowAckConsumerStrategy maxSlowCount="-1" />
> {code}
> worked by just detecting a slow consumer; the consumer was *not* 
> disconnected.
> After AMQ-7079 fix:
> {code:xml}
> <abortSlowAckConsumerStrategy maxSlowCount="-1" />
> {code}
> disconnects the consumer; ActiveMQ logs:
> {code:java}
> 2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
> destination:queue://su.it.linfra.simlu | 
> org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
> Broker[localhost] Scheduler
> 2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> 2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
> java.lang.IllegalStateException: Cannot remove a consumer that had not been 
> registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
> org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
> tcp:///127.0.0.1:53365@61616
> {code}
> Spring Boot logs:
> {code:java}
> 2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
> se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
> running for 2.386)
> 2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
> org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
> ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
> 2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
> se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
> here for the message body...
> 2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
> invoker failed for destination 'su.it.linfra.simlu' - trying to recover. 
> Cause: The Consumer is closed
> 2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
> o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS 
> Connection
> {code}
>  
> The order ("Consumer closed" before "Message Received") is weird because I 
> just use a simple Thread.sleep I suspect:
> {code:java}
>   @Transactional
>   @JmsListener(destination = "su.it.linfra.simlu")
>   public void receiveQueue(String text) throws Exception {
> Thread.sleep(5);
> log.info("Message Received: "+text);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2018-10-19 Thread Gary Tully (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656627#comment-16656627
 ] 

Gary Tully commented on AMQ-7080:
-

It is more to verify the read. A partial write will be ok in this case, but 
with any corruption or truncation it may be possible to read the fingerprint 
together with bad data. The freelist is fundamental: if it is wrong, the index 
gets borked and the journal needs to be replayed. At the moment the index is 
self-contained; it will now depend on another file, so we need to be sure we 
can trust the content of that file.

In addition, it need not change the free list format.
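
As one rough illustration of making the read verifiable (this is not the 
committed patch; the file layout, file handling, and the choice of CRC are 
assumptions), the serialized free list could carry a checksum trailer that the 
recovery path checks before trusting db.free, falling back to the full index 
scan otherwise:

{code:java}
import java.io.*;
import java.util.zip.CRC32;

class FreeListChecksumSketch {

    static void write(File dbFree, byte[] serializedFreeList) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(serializedFreeList);
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(dbFree))) {
            out.writeInt(serializedFreeList.length);
            out.write(serializedFreeList);
            out.writeLong(crc.getValue());   // trailer used to verify the read
        }
    }

    /** Returns the free list bytes, or null if the file cannot be trusted. */
    static byte[] readIfTrusted(File dbFree) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(dbFree))) {
            int len = in.readInt();
            if (len < 0) {
                return null;                 // corrupted length field
            }
            byte[] data = new byte[len];
            in.readFully(data);
            long storedCrc = in.readLong();
            CRC32 crc = new CRC32();
            crc.update(data);
            return crc.getValue() == storedCrc ? data : null; // mismatch: rebuild via full scan
        } catch (EOFException truncated) {
            return null;                     // partial write: ignore db.free and rebuild
        }
    }
}
{code}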

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Priority: Major
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the 
> free pages in the index. To recover this information, ActiveMQ reads the 
> whole index during shutdown searching for free pages and then saves the 
> db.free file. This operation can take a long time, making the failover 
> slower (during the shutdown, ActiveMQ still holds the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> It is important to note that if the shutdown takes more than 
> ACTIVEMQ_KILL_MAXSECONDS seconds, any following shutdown will be unclean. The 
> broker will stay in this state unless the index is deleted (this state means 
> that every failover will take more than ACTIVEMQ_KILL_MAXSECONDS, so if you 
> increase this time to 5 minutes, your failover can take more than 5 minutes).
>  
> To prevent ActiveMQ from reading the whole index file to search for free 
> pages, we can keep track of them on every checkpoint. To do that, we need to 
> be sure that db.data and db.free are in sync. To achieve that, we can have an 
> attribute in db.free that is referenced by db.data.
> So during the checkpoint we have (sketched in code after this quoted 
> description):
> 1 - Save db.free and assign it a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> After a crash, we can check whether db.data has the same freePageUniqueId as 
> db.free. If it does, we can safely use the free page information contained in 
> db.free.
> Now, the only case where we have to read the whole index file again is IF the 
> crash happens between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we have to save db.free during 
> the checkpoint, which can increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in 
> db.free, as it references a stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 gets occupied.
> T3 -> Crash
> In the current scenario, after Pagefile#load P1 will be free, and then the 
> replay will mark P1 as occupied or will occupy another page (now that the 
> recovery of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and reflect 
> reality at T1 (the checkpoint). If they are in sync, we can trust db.free.
> This is a really fast draft of what I'm suggesting... If you guys agree, I 
> can create the proper patch after:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590
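
A compressed sketch of the two checkpoint steps and the recovery check 
described in the quoted description above; names and file layout are 
illustrative only, not the actual KahaDB code or the linked draft commit:

{code:java}
import java.io.*;

class FreePageSyncSketch {

    // Step 1: save db.free and stamp it with a freePageUniqueId.
    static long checkpointFreeList(File dbFree, byte[] serializedFreeList) throws IOException {
        long freePageUniqueId = System.nanoTime(); // any id that changes per checkpoint works for the sketch
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(dbFree))) {
            out.writeLong(freePageUniqueId);
            out.writeInt(serializedFreeList.length);
            out.write(serializedFreeList);
        }
        return freePageUniqueId;
    }

    // Step 2: record the same id in the index metadata (a stand-in file for the db.data metadata).
    static void checkpointMetadata(File dbDataMeta, long freePageUniqueId) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(dbDataMeta))) {
            out.writeLong(freePageUniqueId);
        }
    }

    // Recovery: trust db.free only if both ids match; otherwise scan the whole index.
    static boolean freeListInSync(File dbFree, File dbDataMeta) throws IOException {
        try (DataInputStream free = new DataInputStream(new FileInputStream(dbFree));
             DataInputStream meta = new DataInputStream(new FileInputStream(dbDataMeta))) {
            return free.readLong() == meta.readLong();
        } catch (IOException missingOrTruncated) {
            return false; // missing or damaged file: fall back to the full index scan
        }
    }
}
{code}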



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ARTEMIS-2138) [AMQP] Creating Shared Subscriptions with the same name on different queues fails.

2018-10-19 Thread Robbie Gemmell (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Gemmell resolved ARTEMIS-2138.
-
Resolution: Invalid

From what I see the test actually should fail. Is there a particular reason 
you expected it to work?

You don't have a ClientID set on the Connections, so their shared subscriptions 
are shared among all such connections without a ClientID set. They are using 
the same subscription namespace, so they can't have two different shared 
subscriptions with the same name at the same time.

If you set a ClientID, then sharing is only on that ClientID/Connection, as the 
subscription names are scoped to the ClientID, so two different Connections can 
each have a subscription with the same name at the same time.
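
A minimal sketch of that point, assuming a JMS 2.0 client (the factory, 
ClientIDs, and topic names are illustrative): with distinct ClientIDs, the 
subscription name "foo" lives in two separate namespaces, so both consumers 
can exist at the same time.

{code:java}
import javax.jms.*;

public class SharedSubscriptionScopeSketch {

    static MessageConsumer subscribe(ConnectionFactory cf, String clientId,
                                     String topicName) throws JMSException {
        Connection connection = cf.createConnection();
        connection.setClientID(clientId);        // scopes subscription names to this connection
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic(topicName);
        return session.createSharedConsumer(topic, "foo");  // same name, different ClientID namespaces
    }
}
{code}

With this, subscribe(cf, "client-1", "topics.cats") and 
subscribe(cf, "client-2", "topics.dogs") should both succeed, unlike the 
attached test where no ClientID is set.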

> [AMQP] Creating Shared Subscriptions with the same name on different queues 
> fails.
> --
>
> Key: ARTEMIS-2138
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2138
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: AMQP, Broker
>Affects Versions: 2.6.3
>Reporter: Johan Stenberg
>Priority: Major
> Attachments: Artemis2138_AmqpSharedConsumerTest.java
>
>
> The following exception occurs when, e.g., two clients try to create a shared 
> subscription on different topics with the same subscription name.
> {noformat}
> ActiveMQQueueExistsException[errorType=QUEUE_EXISTS message=AMQ119019: Queue 
> foo:shared-volatile:global already exists on address topics.cats]
>   at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:2763)
>   at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:1690)
>   at 
> org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:594)
>   at 
> org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:634)
> {noformat}
> For example
> {code:java}
> // client 1 subscribes to topics.cats with subscription name "foo"
> Session jmsSess1 = jmsClient1.createSession();
> Topic jmsTopicCats = jmsSess1.createTopic("topics.cats");
> jmsSess1.createSharedConsumer(jmsTopicCats , "foo");
> // client 2 subscribes to topics.dogs with subscription name "foo"
> Session jmsSess2 = jmsClient2.createSession();
> Topic jmsTopicDogs = jmsSess2.createTopic("topics.dogs");
> jmsSess2.createSharedConsumer(jmsTopicDogs, "foo"); // this fails
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARTEMIS-2138) [AMQP] Creating Shared Subscriptions with the same name on different queues fails.

2018-10-19 Thread Johan Stenberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Stenberg updated ARTEMIS-2138:

Summary: [AMQP] Creating Shared Subscriptions with the same name on 
different queues fails.  (was: Creating Shared Subscriptions with the same name 
on different queues fails.)

> [AMQP] Creating Shared Subscriptions with the same name on different queues 
> fails.
> --
>
> Key: ARTEMIS-2138
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2138
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: AMQP, Broker
>Affects Versions: 2.6.3
>Reporter: Johan Stenberg
>Priority: Major
> Attachments: Artemis2138_AmqpSharedConsumerTest.java
>
>
> The following exception occurs when, e.g., two clients try to create a shared 
> subscription on different topics with the same subscription name.
> {noformat}
> ActiveMQQueueExistsException[errorType=QUEUE_EXISTS message=AMQ119019: Queue 
> foo:shared-volatile:global already exists on address topics.cats]
>   at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:2763)
>   at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:1690)
>   at 
> org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:594)
>   at 
> org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:634)
> {noformat}
> For example
> {code:java}
> // client 1 subscribes to topics.cats with subscription name "foo"
> Session jmsSess1 = jmsClient1.createSession();
> Topic jmsTopicCats = jmsSess1.createTopic("topics.cats");
> jmsSess1.createSharedConsumer(jmsTopicCats , "foo");
> // client 2 subscribes to topics.dogs with subscription name "foo"
> Session jmsSess2 = jmsClient2.createSession();
> Topic jmsTopicDogs = jmsSess2.createTopic("topics.dogs");
> jmsSess2.createSharedConsumer(jmsTopicDogs, "foo"); // this fails
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARTEMIS-2138) Creating Shared Subscriptions with the same name on different queues fails.

2018-10-19 Thread Johan Stenberg (JIRA)
Johan Stenberg created ARTEMIS-2138:
---

 Summary: Creating Shared Subscriptions with the same name on 
different queues fails.
 Key: ARTEMIS-2138
 URL: https://issues.apache.org/jira/browse/ARTEMIS-2138
 Project: ActiveMQ Artemis
  Issue Type: Bug
  Components: AMQP, Broker
Affects Versions: 2.6.3
Reporter: Johan Stenberg


The following exception occurs when, e.g., two clients try to create a shared 
subscription on different topics with the same subscription name.
{noformat}
ActiveMQQueueExistsException[errorType=QUEUE_EXISTS message=AMQ119019: Queue 
foo:shared-volatile:global already exists on address topics.cats]
at 
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:2763)
at 
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.createQueue(ActiveMQServerImpl.java:1690)
at 
org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:594)
at 
org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.createQueue(ServerSessionImpl.java:634)
{noformat}
For example
{code:java}
// client 1 subscribes to topics.cats with subscription name "foo"
Session jmsSess1 = jmsClient1.createSession();
Topic jmsTopicCats = jmsSess1.createTopic("topics.cats");
jmsSess1.createSharedConsumer(jmsTopicCats , "foo");

// client 2 subscribes to topics.dogs with subscription name "foo"
Session jmsSess2 = jmsClient2.createSession();
Topic jmsTopicDogs = jmsSess2.createTopic("topics.dogs");
jmsSess2.createSharedConsumer(jmsTopicDogs, "foo"); // this fails
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARTEMIS-2007) Messages not forwaded/redistributed to cluster nodes with matching consumers

2018-10-19 Thread Sebastian (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian updated ARTEMIS-2007:
---
Affects Version/s: (was: 2.6.2)
   2.6.3

> Messages not forwaded/redistributed to cluster nodes with matching consumers
> 
>
> Key: ARTEMIS-2007
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2007
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: AMQP, Broker
>Affects Versions: 2.6.3
>Reporter: Sebastian
>Priority: Critical
> Attachments: artemis-2007.zip
>
>
> We are experiencing the following issue:
> # We configure an Artemis cluster with ON_DEMAND message load balancing and 
> message redistribution enabled.
> # We then connect a single consumer to *queues.queue1* on node1 that has a 
> message filter that does NOT match a given message.
> # Then we send a message to *queues.queue1* on node1.
> # Then we connect a consumer to *queues.queue1* on node2 that has a filter 
> matching the message we sent (see the selector sketch after this 
> description).
> We would now expect the message on node1, which currently has no matching 
> consumer on node1, to be forwarded or redistributed to node2, where a 
> matching consumer exists.
> However, that is not happening: the consumer on node2 does not receive the 
> message, and in our case the message on node1 expires after some time even 
> though a matching consumer is connected to the cluster.
> In the described scenario, when we disconnect the consumer on node1 (which 
> does not match the message anyway), the message is redistributed to node2 and 
> consumed by the matching consumer.
> If no consumer is connected to node1, a message is sent to node1, and only 
> then a matching consumer is connected to node2, the message is forwarded to 
> node2 as expected.
> So I guess the core problem is that redistribution of messages on node1 is 
> not triggered when a matching consumer is connected to node2 while *any* 
> consumer already exists on node1, regardless of whether it actually matches 
> the given message.
> I attached a maven test case that illustrates the issue.
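
Not the attached maven test case, just a minimal sketch of the selector-based 
consumers in steps 2 and 4 of the scenario (the queue name is from the 
description; the selector strings and everything else are illustrative):

{code:java}
import javax.jms.*;

public class SelectorConsumerSketch {

    static MessageConsumer consumeWithSelector(ConnectionFactory cf,
                                               String selector) throws JMSException {
        Connection connection = cf.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("queues.queue1");
        // e.g. "type = 'A'" on node1 (does not match the sent message)
        // vs.  "type = 'B'" on node2 (matches it) -- illustrative selectors only
        return session.createConsumer(queue, selector);
    }
}
{code}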



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Simon Lundstrom (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Lundstrom updated AMQ-7081:
-
Description: 
The fix for AMQ-7079 introduced a breaking change: the default value of 
{{maxSlowCount=-1}} used to be enough for {{abortSlowAckConsumerStrategy}} to 
just configure slow consumer detection, but now it disconnects the consumer as 
well.

Setting {{maxSlowDuration="-1"}} does avoid the disconnect, but I don't think 
we should change the old default behavior.

Pre AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
worked by just detecting a slow consumer; the consumer was *not* disconnected.

After AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
disconnects the consumer; ActiveMQ logs:
{code:java}
2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
destination:queue://su.it.linfra.simlu | 
org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
Broker[localhost] Scheduler
2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
java.lang.IllegalStateException: Cannot remove a consumer that had not been 
registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
{code}
Spring Boot logs:
{code:java}
2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
running for 2.386)
2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
here for the message body...
2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
invoker failed for destination 'su.it.linfra.simlu' - trying to recover. Cause: 
The Consumer is closed
2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS Connection
{code}
 
The order ("Consumer closed" before "Message Received") is weird because I just 
use a simple Thread.sleep I suspect:
{code:java}
  @Transactional
  @JmsListener(destination = "su.it.linfra.simlu")
  public void receiveQueue(String text) throws Exception {
Thread.sleep(5);
log.info("Message Received: "+text);
  }
{code}

  was:
The fix for AMQ-7079 introduced a breaking change: the default value of 
{{maxSlowCount=-1}} used to be enough for {{abortSlowAckConsumerStrategy}} to 
just configure slow consumer detection, but now it disconnects the consumer as 
well.

Setting {{maxSlowDuration="-1"}} does avoid the disconnect, but I don't think 
we should change the old default behavior.

Pre AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
worked by just detecting a slow consumer; the consumer was *not* disconnected.

After AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
disconnects the consumer; ActiveMQ logs:
{code:java}
2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
destination:queue://su.it.linfra.simlu | 
org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
Broker[localhost] Scheduler
2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
java.lang.IllegalStateException: Cannot remove a consumer that had not been 
registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
{code}
Spring Boot logs:
{code:java}
2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
running for 2.386)
2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
here for the message body...
2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
invoker failed for destination 'su.it.linfra.simlu' - trying to recover. Cause: 
The Consumer is closed
2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS Connection
{code}

[jira] [Updated] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Simon Lundstrom (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Lundstrom updated AMQ-7081:
-
Description: 
The fix for AMQ-7079 introduced a breaking change: the default value of 
{{maxSlowCount=-1}} used to be enough for {{abortSlowAckConsumerStrategy}} to 
just configure slow consumer detection, but now it disconnects the consumer as 
well.

Setting {{maxSlowDuration="-1"}} does avoid the disconnect, but I don't think 
we should change the old default behavior.

Pre AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
worked by just detecting a slow consumer; the consumer was *not* disconnected.

After AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
disconnects the consumer; ActiveMQ logs:
{code:java}
2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
destination:queue://su.it.linfra.simlu | 
org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
Broker[localhost] Scheduler
2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
java.lang.IllegalStateException: Cannot remove a consumer that had not been 
registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
{code}
Spring Boot logs:
{code:java}
2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
running for 2.386)
2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
here for the message body...
2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
invoker failed for destination 'su.it.linfra.simlu' - trying to recover. Cause: 
The Consumer is closed
2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS Connection
{code}

  was:
The fix for AMQ-7079 introduced a breaking change: the default value of 
{{maxSlowCount=-1}} used to be enough for {{abortSlowAckConsumerStrategy}} to 
just configure slow consumer detection, but now it disconnects the consumer as 
well.

Setting {{maxSlowDuration="-1"}} does avoid the disconnect, but I don't think 
we should change the old default behavior.

Pre AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
worked by just detecting a slow consumer; the consumer was *not* disconnected.

After AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
disconnects the consumer; ActiveMQ logs:
{code}
2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
destination:queue://su.it.linfra.simlu | 
org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
Broker[localhost] Scheduler
2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
java.lang.IllegalStateException: Cannot remove a consumer that had not been 
registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
{code}
Spring Boot logs:
{code}
2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
running for 2.386)
2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
here for the message body...
2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
invoker failed for destination 'su.it.linfra.simlu' - trying to recover. Cause: 
The Consumer is closed
2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS Connection
{code}


> After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default
> 
>
> Key: AMQ-7081
> URL: https://issues.apache.org/jira/browse/AMQ-7081
>

[jira] [Created] (AMQ-7081) After AMQ-7079 abortSlowAckConsumerStrategy aborts connection by default

2018-10-19 Thread Simon Lundstrom (JIRA)
Simon Lundstrom created AMQ-7081:


 Summary: After AMQ-7079 abortSlowAckConsumerStrategy aborts 
connection by default
 Key: AMQ-7081
 URL: https://issues.apache.org/jira/browse/AMQ-7081
 Project: ActiveMQ
  Issue Type: Bug
  Components: Broker
Reporter: Simon Lundstrom


The fix for AMQ-7079 introduced a breaking change: the default value of 
{{maxSlowCount=-1}} used to be enough for {{abortSlowAckConsumerStrategy}} to 
just configure slow consumer detection, but now it disconnects the consumer as 
well.

Setting {{maxSlowDuration="-1"}} does avoid the disconnect, but I don't think 
we should change the old default behavior.

Pre AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
worked by just detecting a slow consumer; the consumer was *not* disconnected.

After AMQ-7079 fix:
{code:xml}
<abortSlowAckConsumerStrategy maxSlowCount="-1" />
{code}
disconnects the consumer; ActiveMQ logs:
{code}
2018-10-19 10:42:33,124 | INFO  | aborting slow consumer: 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 for 
destination:queue://su.it.linfra.simlu | 
org.apache.activemq.broker.region.policy.AbortSlowConsumerStrategy | ActiveMQ 
Broker[localhost] Scheduler
2018-10-19 10:42:50,250 | WARN  | no matching consumer, ignoring ack null | 
org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
2018-10-19 10:42:50,257 | WARN  | Async error occurred: 
java.lang.IllegalStateException: Cannot remove a consumer that had not been 
registered: ID:kaka.it.su.se-53364-1539938520009-1:1:1:1 | 
org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: 
tcp:///127.0.0.1:53365@61616
{code}
Spring Boot logs:
{code}
2018-10-19 10:42:00.209  INFO 65846 --- [   main] 
se.su.it.simlu.esb.App   : Started App in 1.849 seconds (JVM 
running for 2.386)
2018-10-19 10:42:33.129  WARN 65846 --- [0.1:61616@53365] 
org.apache.activemq.ActiveMQSession  : Closed consumer on Command, 
ID:kaka.it.su.se-53364-1539938520009-1:1:1:1
2018-10-19 10:42:50.247  INFO 65846 --- [enerContainer-1] 
se.su.it.simlu.esb.Consumer  : Message Received: Enter some text 
here for the message body...
2018-10-19 10:42:50.261  WARN 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Setup of JMS message listener 
invoker failed for destination 'su.it.linfra.simlu' - trying to recover. Cause: 
The Consumer is closed
2018-10-19 10:42:50.300  INFO 65846 --- [enerContainer-1] 
o.s.j.l.DefaultMessageListenerContainer  : Successfully refreshed JMS Connection
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AMQ-6977) JDBC Failure: Connection PoolingConnection:

2018-10-19 Thread Robert Jin (JIRA)


[ 
https://issues.apache.org/jira/browse/AMQ-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656357#comment-16656357
 ] 

Robert Jin commented on AMQ-6977:
-

In which cases will ActiveMQ output a log like 'JDBC Failure: Connection 
PoolingConnection'?

> JDBC Failure: Connection PoolingConnection: 
> 
>
> Key: AMQ-6977
> URL: https://issues.apache.org/jira/browse/AMQ-6977
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 5.15.2
> Environment: Red Hat Enterprise Linux Server release 7.2
>Reporter: Robert Jin
>Assignee: Jean-Baptiste Onofré
>Priority: Blocker
>
> A log like the following appeared intermittently (anywhere from every hour to 
> several days apart). After four months, the broker stopped working.
> 2018-02-03 09:16:32,028 | WARN | JDBC Failure: Connection PoolingConnection: 
> org.apache.commons.pool.impl.GenericKeyedObjectPool@1697db40 is closed. | 
> org.apache.activemq.store.jdbc.JDBCPersistenceAdapter | ActiveMQ Transport: 
> tcp:///172.31.56.153:38168@61616
> java.sql.SQLException: Connection PoolingConnection: 
> org.apache.commons.pool.impl.GenericKeyedObjectPool@20e135f1 is closed.
> at 
> org.apache.commons.dbcp.DelegatingConnection.checkOpen(DelegatingConnection.java:519)[commons-dbcp-1.5.jar:1.5-SNAPSHOT]
> at 
> org.apache.commons.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:286)[commons-dbcp-1.5.jar:1.5-SNAPSHOT]
> at 
> org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:322)[commons-dbcp-1.5.jar:1.5-SNAPSHOT]
> at 
> org.apache.activemq.store.jdbc.TransactionContext$UnlockOnCloseConnection.prepareStatement(TransactionContext.java:308)[activemq-jdbc-store-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter.doAddMessage(DefaultJDBCAdapter.java:227)[activemq-jdbc-store-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.store.jdbc.JDBCMessageStore.addMessage(JDBCMessageStore.java:158)[activemq-jdbc-store-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.store.memory.MemoryTransactionStore.addMessage(MemoryTransactionStore.java:352)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.store.memory.MemoryTransactionStore$1.asyncAddQueueMessage(MemoryTransactionStore.java:159)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:854)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.region.Queue.send(Queue.java:743)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:505)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:459)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.jmx.ManagedRegionBroker.send(ManagedRegionBroker.java:293)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:154)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:293)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:154)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:154)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.TransportConnection.processMessage(TransportConnection.java:572)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.command.ActiveMQMessage.visit(ActiveMQMessage.java:768)[activemq-client-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:330)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:194)[activemq-broker-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)[activemq-client-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:125)[activemq-client-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:301)[activemq-client-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)[activemq-client-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:233)[activemq-client-5.15.2.jar:5.15.2]
> at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:215)[activemq-client-5.15.2.jar:5.15.2]
> at