[jira] [Updated] (AMQ-7394) 'recovery' of messages does not take into account the PolicyMap/PolicyEntry 'memoryLimit', but takes 200 messages from the store

Reinald Verheij (Jira) Thu, 30 Jan 2020 07:39:47 -0800


     [ 
https://issues.apache.org/jira/browse/AMQ-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Reinald Verheij updated AMQ-7394:
---------------------------------
    Description: 
I believe this affects versions up to 5.15.11 too, since I don't see changes in 
the related code.

Perhaps this issue is similar to the KahaDB JIRA AMQ-7126 which was fixed in 
5.15.9?

At Infor we use activeMQ 5.15.2 currently with the JDBC Persistence Adapter.

When a queue (the tables in the DB for the queue) has many messages of a 
significant size (e.g. 200 messages of 5 MB) and I start the brokerService then 
it will 'recover' the messages. The first call will get 400 messages and when 
consuming and removing those subsequent calls will get 200 messages each 
*without* taking into account the configured memory limits per-broker or 
per-queue. So memory consumption is rather unconstrained here.

I believe this is because [JDBCMessageStore.java 
recoverNextMessages|https://github.com/apache/activemq/blob/master/activemq-jdbc-store/src/main/java/org/apache/activemq/store/jdbc/JDBCMessageStore.java]
 always returns 'true' (at line 368 currently) without asking 
{{listener.hasSpace()}} (or rather the new {{canRecoveryNextMessage()}}) 
{code:java}                @Override
                public boolean recoverMessage(long sequenceId, byte[] data) 
throws Exception {
                        Message msg = (Message)wireFormat.unmarshal(new 
ByteSequence(data));
                        msg.getMessageId().setBrokerSequenceId(sequenceId);
                        msg.getMessageId().setFutureOrSequenceLong(sequenceId);
                        msg.getMessageId().setEntryLocator(sequenceId);
                        listener.recoverMessage(msg);
                        trackLastRecovered(sequenceId, msg.getPriority());
                        return true;
                }{code}

... so the resultset iterator in DefaultJDBCAdapter::doRecoverNextMessages 
never jumps into the else branch to abort loading/recovering messages (at line 
1093 currently) {code:java}                while (rs.next() && count < 
maxReturned) {
                    if (listener.recoverMessage(rs.getLong(1), 
getBinaryData(rs, 2))) {
                        count++;
                    } else {
                        LOG.debug("Stopped recover next messages");
                        break;
                    }
                }{code}

I did a workaround by subclassing the TransactJDBCAdapter and then wrapping the 
listener to impose my own limit (however this is a separately configured limit; 
it is not coupled with the memory limits of the broker or the queue, so it is 
more a workaround than a solution; but it avoids customizing activeMQ code, the 
workaround is in our own product code.

My suggested improvement is:
* build the listener.hasSpace() (or rather the new 
{{canRecoveryNextMessage()}}) as was done for KahaDB in AMQ-7126
* AND do statement.cancel() if the listener does not have the space, to avoid 
performance trouble, see below.

h3. Explanation that leads to the suggestion to do statement.cancel() if 
listener.hasSpace() (or rather the new {{canRecoveryNextMessage()}}) returns 
false.
When the recover() produced fewer than requested messages due to size 
restrictions (e.g. it produced 10 messages instead of 200 because they were 
large) I noticed delays from consumer perspective of 4 seconds (with local sql 
server on ssd, so must be higher with clustered SQL server...). It appeared to 
be taking a long time to close the resultset. The answer is that the resultset 
needs to be exhausted during rs.close() because otherwise the connection can't 
be reused later. 

Source: someone asked this on the msft forum in 2009: {quote}I'm encountering a 
strange issue with the ResultSet.close() command with the new 2.0 jdbc drivers 
(not the JDBC 4.0 version). 

If I run a query that returns a large result set and just iterate through the 
first couple of rows and then attempt to call close() on the result set the 
close() call hangs for quite some time insted of closing almost instantly as 
past versions of the driver have done with this same code. {quote} and then 
David Olix responded saying that it's expected behavior in SQL Server driver as 
of 2.0, and suggested to cancel the sql statement. {quote}Hi,

Thank you for considering the SQL Server JDBC Driver 2.0.  And thank you for 
your post.  It illustrates a very interesting point about data retrieval with 
the 2.0 driver...

Short answer:
The behavior you are seeing is expected in the default configuration of the 2.0 
driver.  You should be able to revert to the previous behavior by setting the 
connection property "responseBuffering=full".  But read on for why that may not 
be your best solution...

Long answer:
When SQL Server returns a large number of rows from a SELECT query, the JDBC 
driver must process all of them to be able to execute the next query on the 
same connection.  Retrieving those rows from the server can take a substantial 
amount of time, especially over a slower network connection.  By default, older 
versions of the driver retrieved all of the rows into memory when the statement 
was executed.  The 1.2 driver introduced a new optional behavior called 
adaptive response buffering, which allows the driver to retrieve row data (and 
other results) from the server as the application accesses/demands it.  The 
benefit of adaptive buffering is lower client side memory usage -- much lower 
for large result sets -- with amortized latency across ResultSet accessor 
methods for apps that process all the result data in the order returned.  The 
2.0 driver uses adaptive response buffering by default.  In your case, the app 
accesses only the first few rows.  So ResultSet.close() takes longer because it 
must read and discard the rest of the rows.

That raises the question:
If your application is only interested in the first few rows, why does your 
query return thousands of them?

Performance and client-side memory usage would be much improved if you can 
structure the query to return only the potentially interesting rows.  Put 
another way, it's best not to query for data that you're not going to use.

If that is not an option, and you have no other choice but to query for all of 
the rows, there are other options that may yield improved performance still:

Consider cancelling the statement before closing the ResultSet.  Cancelling 
tells the driver to notify the server that it is not interested in any 
execution results (including remaining rows) that haven't already been 
received.  You have to weigh the cost of cancellation, which incurs a short 
round trip request/response to the server, against the cost of receiving the 
rest of the rows when using this technique, but it sounds like it would be a 
definite win if there are tens of thousands of rows left.

Or consider using a server-side cursor when executing the query.  You can do 
this by specifying 
com.microsoft.sqlserver.jdbc.SQLServerResultSet.TYPE_SS_SERVER_CURSOR_FORWARD_ONLY
 rather than java.sql.ResultSet.TYPE_FORWARD_ONLY for the scrollability 
parameter to the createStatement()/prepareStatement() call.  Server-side 
cursors allow the driver to retrieve rows a few at a time.  There's a cost with 
server cursors too: a round trip to fetch each block of rows.  But closing the 
ResultSet closes the server cursor at which point no further rows are returned. 
 But before using a server cursor, make sure the query is cursorable, or SQL 
Server will just return all the rows anyway...

The last option is the one from the short answer: use full response buffering 
rather than adaptive response buffering.  This may be the least desirable 
solution, however, since the app still has to wait for all the rows to be 
received from the server (in executeQuery() though, rather than 
ResultSet.close()).  But rather than discarding the uninteresting rows, it has 
to buffer them, which could lead to an OutOfMemoryError if the result set is 
large enough.

HTH,
--David Olix [SQL Server]{quote}
Source (not sure if URL is stable, hence the full quote): 
https://social.msdn.microsoft.com/Forums/security/en-US/7ceee9a8-09d8-4fbd-9918-0ca065fa182e/resultsetclose-hangs-with-new-jdbc-20?forum=sqldataaccess

  was:
I believe this affects versions up to 5.15.11 too, since I don't see changes in 
the related code.

Perhaps this issue is similar to the KahaDB JIRA AMQ-7126 which was fixed in 
5.15.9?

At Infor we use activeMQ 5.15.2 currently with the JDBC Persistence Adapter.

When a queue (the tables in the DB for the queue) has many messages of a 
significant size (e.g. 200 messages of 5 MB) and I start the brokerService then 
it will 'recover' the messages. The first call will get 400 messages and when 
consuming and removing those subsequent calls will get 200 messages each 
*without* taking into account the configured memory limits per-broker or 
per-queue. So memory consumption is rather unconstrained here.

I believe this is because [JDBCMessageStore.java 
recoverNextMessages|https://github.com/apache/activemq/blob/master/activemq-jdbc-store/src/main/java/org/apache/activemq/store/jdbc/JDBCMessageStore.java]
 always returns 'true' (at line 368 currently) without asking 
{{listener.hasSpace()}} (or rather the new {{canRecoverNextMessage()}}) 
{code:java}                @Override
                public boolean recoverMessage(long sequenceId, byte[] data) 
throws Exception {
                        Message msg = (Message)wireFormat.unmarshal(new 
ByteSequence(data));
                        msg.getMessageId().setBrokerSequenceId(sequenceId);
                        msg.getMessageId().setFutureOrSequenceLong(sequenceId);
                        msg.getMessageId().setEntryLocator(sequenceId);
                        listener.recoverMessage(msg);
                        trackLastRecovered(sequenceId, msg.getPriority());
                        return true;
                }{code}

... so the resultset iterator in DefaultJDBCAdapter::doRecoverNextMessages 
never jumps into the else branch to abort loading/recovering messages (at line 
1093 currently) {code:java}                while (rs.next() && count < 
maxReturned) {
                    if (listener.recoverMessage(rs.getLong(1), 
getBinaryData(rs, 2))) {
                        count++;
                    } else {
                        LOG.debug("Stopped recover next messages");
                        break;
                    }
                }{code}

I did a workaround by subclassing the TransactJDBCAdapter and then wrapping the 
listener to impose my own limit (however this is a separately configured limit; 
it is not coupled with the memory limits of the broker or the queue, so it is 
more a workaround than a solution; but it avoids customizing activeMQ code, the 
workaround is in our own product code.

My suggested improvement is:
* build the listener.hasSpace() (or rather the new {{canRecoverNextMessage()}}) 
as was done for KahaDB in AMQ-7126
* AND do statement.cancel() if the listener does not have the space, to avoid 
performance trouble, see below.

h3. Explanation that leads to the suggestion to do statement.cancel() if 
listener.hasSpace() (or rather the new {{canRecoverNextMessage()}}) returns 
false.
When the recover() produced fewer than requested messages due to size 
restrictions (e.g. it produced 10 messages instead of 200 because they were 
large) I noticed delays from consumer perspective of 4 seconds (with local sql 
server on ssd, so must be higher with clustered SQL server...). It appeared to 
be taking a long time to close the resultset. The answer is that the resultset 
needs to be exhausted during rs.close() because otherwise the connection can't 
be reused later. 

Source: someone asked this on the msft forum in 2009: {quote}I'm encountering a 
strange issue with the ResultSet.close() command with the new 2.0 jdbc drivers 
(not the JDBC 4.0 version). 

If I run a query that returns a large result set and just iterate through the 
first couple of rows and then attempt to call close() on the result set the 
close() call hangs for quite some time insted of closing almost instantly as 
past versions of the driver have done with this same code. {quote} and then 
David Olix responded saying that it's expected behavior in SQL Server driver as 
of 2.0, and suggested to cancel the sql statement. {quote}Hi,

Thank you for considering the SQL Server JDBC Driver 2.0.  And thank you for 
your post.  It illustrates a very interesting point about data retrieval with 
the 2.0 driver...

Short answer:
The behavior you are seeing is expected in the default configuration of the 2.0 
driver.  You should be able to revert to the previous behavior by setting the 
connection property "responseBuffering=full".  But read on for why that may not 
be your best solution...

Long answer:
When SQL Server returns a large number of rows from a SELECT query, the JDBC 
driver must process all of them to be able to execute the next query on the 
same connection.  Retrieving those rows from the server can take a substantial 
amount of time, especially over a slower network connection.  By default, older 
versions of the driver retrieved all of the rows into memory when the statement 
was executed.  The 1.2 driver introduced a new optional behavior called 
adaptive response buffering, which allows the driver to retrieve row data (and 
other results) from the server as the application accesses/demands it.  The 
benefit of adaptive buffering is lower client side memory usage -- much lower 
for large result sets -- with amortized latency across ResultSet accessor 
methods for apps that process all the result data in the order returned.  The 
2.0 driver uses adaptive response buffering by default.  In your case, the app 
accesses only the first few rows.  So ResultSet.close() takes longer because it 
must read and discard the rest of the rows.

That raises the question:
If your application is only interested in the first few rows, why does your 
query return thousands of them?

Performance and client-side memory usage would be much improved if you can 
structure the query to return only the potentially interesting rows.  Put 
another way, it's best not to query for data that you're not going to use.

If that is not an option, and you have no other choice but to query for all of 
the rows, there are other options that may yield improved performance still:

Consider cancelling the statement before closing the ResultSet.  Cancelling 
tells the driver to notify the server that it is not interested in any 
execution results (including remaining rows) that haven't already been 
received.  You have to weigh the cost of cancellation, which incurs a short 
round trip request/response to the server, against the cost of receiving the 
rest of the rows when using this technique, but it sounds like it would be a 
definite win if there are tens of thousands of rows left.

Or consider using a server-side cursor when executing the query.  You can do 
this by specifying 
com.microsoft.sqlserver.jdbc.SQLServerResultSet.TYPE_SS_SERVER_CURSOR_FORWARD_ONLY
 rather than java.sql.ResultSet.TYPE_FORWARD_ONLY for the scrollability 
parameter to the createStatement()/prepareStatement() call.  Server-side 
cursors allow the driver to retrieve rows a few at a time.  There's a cost with 
server cursors too: a round trip to fetch each block of rows.  But closing the 
ResultSet closes the server cursor at which point no further rows are returned. 
 But before using a server cursor, make sure the query is cursorable, or SQL 
Server will just return all the rows anyway...

The last option is the one from the short answer: use full response buffering 
rather than adaptive response buffering.  This may be the least desirable 
solution, however, since the app still has to wait for all the rows to be 
received from the server (in executeQuery() though, rather than 
ResultSet.close()).  But rather than discarding the uninteresting rows, it has 
to buffer them, which could lead to an OutOfMemoryError if the result set is 
large enough.

HTH,
--David Olix [SQL Server]{quote}
Source (not sure if URL is stable, hence the full quote): 
https://social.msdn.microsoft.com/Forums/security/en-US/7ceee9a8-09d8-4fbd-9918-0ca065fa182e/resultsetclose-hangs-with-new-jdbc-20?forum=sqldataaccess


> 'recovery' of messages does not take into account the PolicyMap/PolicyEntry 
> 'memoryLimit', but takes 200 messages from the store
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-7394
>                 URL: https://issues.apache.org/jira/browse/AMQ-7394
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: JDBC
>    Affects Versions: 5.15.2
>            Reporter: Reinald Verheij
>            Priority: Major
>
> I believe this affects versions up to 5.15.11 too, since I don't see changes 
> in the related code.
> Perhaps this issue is similar to the KahaDB JIRA AMQ-7126 which was fixed in 
> 5.15.9?
> At Infor we use activeMQ 5.15.2 currently with the JDBC Persistence Adapter.
> When a queue (the tables in the DB for the queue) has many messages of a 
> significant size (e.g. 200 messages of 5 MB) and I start the brokerService 
> then it will 'recover' the messages. The first call will get 400 messages and 
> when consuming and removing those subsequent calls will get 200 messages each 
> *without* taking into account the configured memory limits per-broker or 
> per-queue. So memory consumption is rather unconstrained here.
> I believe this is because [JDBCMessageStore.java 
> recoverNextMessages|https://github.com/apache/activemq/blob/master/activemq-jdbc-store/src/main/java/org/apache/activemq/store/jdbc/JDBCMessageStore.java]
>  always returns 'true' (at line 368 currently) without asking 
> {{listener.hasSpace()}} (or rather the new {{canRecoveryNextMessage()}}) 
> {code:java}                @Override
>                 public boolean recoverMessage(long sequenceId, byte[] data) 
> throws Exception {
>                         Message msg = (Message)wireFormat.unmarshal(new 
> ByteSequence(data));
>                         msg.getMessageId().setBrokerSequenceId(sequenceId);
>                         
> msg.getMessageId().setFutureOrSequenceLong(sequenceId);
>                         msg.getMessageId().setEntryLocator(sequenceId);
>                         listener.recoverMessage(msg);
>                         trackLastRecovered(sequenceId, msg.getPriority());
>                         return true;
>                 }{code}
> ... so the resultset iterator in DefaultJDBCAdapter::doRecoverNextMessages 
> never jumps into the else branch to abort loading/recovering messages (at 
> line 1093 currently) {code:java}                while (rs.next() && count < 
> maxReturned) {
>                     if (listener.recoverMessage(rs.getLong(1), 
> getBinaryData(rs, 2))) {
>                         count++;
>                     } else {
>                         LOG.debug("Stopped recover next messages");
>                         break;
>                     }
>                 }{code}
> I did a workaround by subclassing the TransactJDBCAdapter and then wrapping 
> the listener to impose my own limit (however this is a separately configured 
> limit; it is not coupled with the memory limits of the broker or the queue, 
> so it is more a workaround than a solution; but it avoids customizing 
> activeMQ code, the workaround is in our own product code.
> My suggested improvement is:
> * build the listener.hasSpace() (or rather the new 
> {{canRecoveryNextMessage()}}) as was done for KahaDB in AMQ-7126
> * AND do statement.cancel() if the listener does not have the space, to avoid 
> performance trouble, see below.
> h3. Explanation that leads to the suggestion to do statement.cancel() if 
> listener.hasSpace() (or rather the new {{canRecoveryNextMessage()}}) returns 
> false.
> When the recover() produced fewer than requested messages due to size 
> restrictions (e.g. it produced 10 messages instead of 200 because they were 
> large) I noticed delays from consumer perspective of 4 seconds (with local 
> sql server on ssd, so must be higher with clustered SQL server...). It 
> appeared to be taking a long time to close the resultset. The answer is that 
> the resultset needs to be exhausted during rs.close() because otherwise the 
> connection can't be reused later. 
> Source: someone asked this on the msft forum in 2009: {quote}I'm encountering 
> a strange issue with the ResultSet.close() command with the new 2.0 jdbc 
> drivers (not the JDBC 4.0 version). 
> If I run a query that returns a large result set and just iterate through the 
> first couple of rows and then attempt to call close() on the result set the 
> close() call hangs for quite some time insted of closing almost instantly as 
> past versions of the driver have done with this same code. {quote} and then 
> David Olix responded saying that it's expected behavior in SQL Server driver 
> as of 2.0, and suggested to cancel the sql statement. {quote}Hi,
> Thank you for considering the SQL Server JDBC Driver 2.0.  And thank you for 
> your post.  It illustrates a very interesting point about data retrieval with 
> the 2.0 driver...
> Short answer:
> The behavior you are seeing is expected in the default configuration of the 
> 2.0 driver.  You should be able to revert to the previous behavior by setting 
> the connection property "responseBuffering=full".  But read on for why that 
> may not be your best solution...
> Long answer:
> When SQL Server returns a large number of rows from a SELECT query, the JDBC 
> driver must process all of them to be able to execute the next query on the 
> same connection.  Retrieving those rows from the server can take a 
> substantial amount of time, especially over a slower network connection.  By 
> default, older versions of the driver retrieved all of the rows into memory 
> when the statement was executed.  The 1.2 driver introduced a new optional 
> behavior called adaptive response buffering, which allows the driver to 
> retrieve row data (and other results) from the server as the application 
> accesses/demands it.  The benefit of adaptive buffering is lower client side 
> memory usage -- much lower for large result sets -- with amortized latency 
> across ResultSet accessor methods for apps that process all the result data 
> in the order returned.  The 2.0 driver uses adaptive response buffering by 
> default.  In your case, the app accesses only the first few rows.  So 
> ResultSet.close() takes longer because it must read and discard the rest of 
> the rows.
> That raises the question:
> If your application is only interested in the first few rows, why does your 
> query return thousands of them?
> Performance and client-side memory usage would be much improved if you can 
> structure the query to return only the potentially interesting rows.  Put 
> another way, it's best not to query for data that you're not going to use.
> If that is not an option, and you have no other choice but to query for all 
> of the rows, there are other options that may yield improved performance 
> still:
> Consider cancelling the statement before closing the ResultSet.  Cancelling 
> tells the driver to notify the server that it is not interested in any 
> execution results (including remaining rows) that haven't already been 
> received.  You have to weigh the cost of cancellation, which incurs a short 
> round trip request/response to the server, against the cost of receiving the 
> rest of the rows when using this technique, but it sounds like it would be a 
> definite win if there are tens of thousands of rows left.
> Or consider using a server-side cursor when executing the query.  You can do 
> this by specifying 
> com.microsoft.sqlserver.jdbc.SQLServerResultSet.TYPE_SS_SERVER_CURSOR_FORWARD_ONLY
>  rather than java.sql.ResultSet.TYPE_FORWARD_ONLY for the scrollability 
> parameter to the createStatement()/prepareStatement() call.  Server-side 
> cursors allow the driver to retrieve rows a few at a time.  There's a cost 
> with server cursors too: a round trip to fetch each block of rows.  But 
> closing the ResultSet closes the server cursor at which point no further rows 
> are returned.  But before using a server cursor, make sure the query is 
> cursorable, or SQL Server will just return all the rows anyway...
> The last option is the one from the short answer: use full response buffering 
> rather than adaptive response buffering.  This may be the least desirable 
> solution, however, since the app still has to wait for all the rows to be 
> received from the server (in executeQuery() though, rather than 
> ResultSet.close()).  But rather than discarding the uninteresting rows, it 
> has to buffer them, which could lead to an OutOfMemoryError if the result set 
> is large enough.
> HTH,
> --David Olix [SQL Server]{quote}
> Source (not sure if URL is stable, hence the full quote): 
> https://social.msdn.microsoft.com/Forums/security/en-US/7ceee9a8-09d8-4fbd-9918-0ca065fa182e/resultsetclose-hangs-with-new-jdbc-20?forum=sqldataaccess



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (AMQ-7394) 'recovery' of messages does not take into account the PolicyMap/PolicyEntry 'memoryLimit', but takes 200 messages from the store

Reply via email to