I am trying to use persistent storage with a large number of messages and I think I'm running into trouble with the JDBC statements.
In particular, I'm trying to use Derby, but I think there are some general performance problems with the SQL statements. 1. Many of the SELECT statements actually return many rows. 2a. Right before the JDBC call is made, setMaxRows() should be called on the statement with the maximum number of rows to be captured. 2b. A "TOP" or "LIMIT" clause should be a part of these SQL SELECT statements. Otherwise, the database can attempt to sort a huge number of rows since these also contain an ORDER BY. Unfortunately, there is no standard syntax for this -- it varies depending on the RDBMS. The current implementation stops after reading x number rows, but it is too late, the database may have already done a lot of work (demonstrated below). For some RDBMS, the numeric argument to TOP/LIMIT may be parameterized, other times it must be put directly in the SQL string. I think the statements affected are: getFindDurableSubMessagesStatement getNextDurableSubscriberMessageStatement getPrevDurableSubscriberMessageIdStatement getNextDurableSubscriberMessageIdStatement -- Note: The next and prev statements can probably be rewritten in a way that uses MIN or MAX to avoid needing to use TOP/LIMIT. The most problems come with the getFindDurableSubMessagesStatement. 2. Why do many of these statements bother to join to the ACKS table? Some of them do not obtain any information from these tables regarding which messages are to be retrieved (maybe this is needed to make sure an entry exists there, but can't this be figured out just by the IDs/messages?) 3. I believe the index on (MSGID_PROD,MSGID_SEQ) can be declared as a UNIQUE INDEX. This helps out some optimizers, not sure how much it would affect ActiveMQ. 4. The prepared statements probably should be cached/pooled (in whatever way is thread-safe), but I think this is the least of the problems. Really bad performance due to this was observed with Derby. Happens even when I submit the commands directly to Derby, bypassing ActiveMQ. The setMaxRows did not fix it, and Derby does not provide any form of TOP or LIMIT. :( Tried against the latest release of Derby and no luck there (in fact, I was getting out of memory error manually issuing the SQL statements to the database). I am wondering as to what kind of persistence people here recommend? I may need to persist millions of records. *** Relevant part of the optimizer log (notice that Rows Visited = 100299, even though only 100 are desired). Right result set: Table Scan ResultSet for ACTIVEMQ_MSGS at read committed isolation level using instantaneous share row locking chosen by the optimizer Number of opens = 1 Rows seen = 100287 Rows filtered = 0 Fetch Size = 16 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 1, 5} Number of columns fetched=3 Number of pages visited=4777 Number of rows qualified=100287 Number of rows visited=100299 Scan type=heap start position: null stop position: null qualifiers: -- View this message in context: http://www.nabble.com/Statements.java%2C-etc.-Performance-tf2662372.html#a7425760 Sent from the ActiveMQ - User mailing list archive at Nabble.com.