Also, your question makes the assumption that files your broker stumbles upon when it becomes master were written by a previous master that shutdown cleanly, which is a rather optimistic assumption to make. On Apr 29, 2016 7:01 AM, "Tim Bain" <tb...@alumni.duke.edu> wrote:
> By reading the code ( > http://www.grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-kahadb-store/5.11.1/org/apache/activemq/store/kahadb/disk/journal/Journal.java#Journal.recoveryCheck%28org.apache.activemq.store.kahadb.disk.journal.DataFile%29), > I found that we're reading the data files to confirm that they're not > corrupted. That's the beauty of an open-source project: all of the source > code is available, so you can read it yourself and see what's being done. > > You should confirm that your OS will properly invalidate its cache entry > for a given file when that file changes; having KahaDB read stale cache > entries would be a very bad thing. > > Tim > We have found the main cause for this issue. Current Master and new Master > seem to do the exact same things on startup. > They read all the journals. > > It seems that the OS is caching the files in memory in some cases. > The master seems to have all the jorunal files in the cache. So when it > restarts it is fast. > The new master seems to not have the journals in cache so they have to be > transfered over the network (~10MB/s) in our case. So this > causes the difference in startup time. > > One big question remains: > Why are the whole journals read? Shouldn't a master that is shut down in a > controlled way leave a clean state, so on startup only the meta data index > is read? > > Christian > > On 15.02.2016 13:32, Christian Schneider wrote: > >> At a customer we are experiencing a very strange behaviour of activemq. >> >> We have two brokers on separate machines A and B. Storage is on a third >> machine and imported using nfsv4. >> Kahadb contains about 18 GB of journal files. >> >> Scenario 1: >> - Stop B >> - Stop A (master now) >> - Start A >> >> The start of A takes about 50 seconds. >> >> Scnario 2: >> - Stop A >> - Stop B (master now) >> - Start A >> >> The start of A takes about 5 minutes. >> >> In the slow case the log at debug level shows a big gap between adding >> the last queue (12:04) and running the ActiveMQ Journal Checkpoint Worker >> again after the queue creation (12:09). >> I have no idea what activemq is doing in the mean time but it seems to >> take several minutes. >> >> Can anyone explain this behaviour or explain what activemq does in the >> mean time? I would also be happy about pointer to the code. >> What I do not understand is why it makes a difference if the master is >> started again or if the slave is becoming master. As they both share the >> same kahadb over nfs I would expect the same startup >> behaviour. >> >> Christian >> >> ----- >> >> Slow start >> 2016-02-15 12:03:01,943 | INFO | Refreshing >> org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date >> [Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy | >> org.apache.activemq.xbean.XBeanBrokerFactory$1 | main >> ... >> 2016-02-15 12:03:03,018 | INFO | >> PListStore:[/path/activemq/data/A/tmp_storage] started | >> org.apache.activemq.store.kahadb.plist.PListStoreImpl | main >> 2016-02-15 12:03:03,033 | INFO | ignoring zero length, partially >> initialised journal data file: db-1.log number = 1 , length = 0 | >> org.apache.activemq.store.kahadb.disk.journal.Journal | main >> 2016-02-15 12:03:03,177 | INFO | JobSchedulerStore:path/scheduler >> started | org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl >> | main >> 2016-02-15 12:03:03,180 | INFO | JobScheduler using directory: >> path/scheduler | org.apache.activemq.broker.BrokerService | main >> 2016-02-15 12:03:03,335 | INFO | Using Persistence Adapter: >> KahaDBPersistenceAdapter[/path] | org.apache.activemq.broker.BrokerService >> | main >> 2016-02-15 12:03:03,350 | INFO | JMX consoles can connect to >> service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi | >> org.apache.activemq.broker.jmx.ManagementContext | JMX connector >> 2016-02-15 12:03:05,198 | INFO | Corrupt journal records found in >> 'path/db-1122.log' between offsets: 17893102..17893732 | >> org.apache.activemq.store.kahadb.disk.journal.Journal | main >> 2016-02-15 12:03:47,925 | INFO | KahaDB is version 5 | >> org.apache.activemq.store.kahadb.MessageDatabase | main >> .... >> 2016-02-15 12:03:49,233 | INFO | Recovering from the journal ... | >> org.apache.activemq.store.kahadb.MessageDatabase | main >> 2016-02-15 12:03:49,237 | INFO | Recovery replayed 20 operations from >> the journal in 0.094 seconds. | >> org.apache.activemq.store.kahadb.MessageDatabase | main >> 2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. | >> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal >> Checkpoint Worker >> 2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. | >> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal >> Checkpoint Worker >> 2016-02-15 12:04:26,940 | INFO | Apache ActiveMQ 5.10.2 (A, >> ID:A-59167-1455534183070-1:1) is starting | >> org.apache.activemq.broker.BrokerService | main >> ... >> 2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ | >> org.apache.activemq.broker.region.AbstractRegion | main >> 2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. | >> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler >> 2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0, >> pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0, >> dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue | >> ActiveMQ Broker[A] Scheduler >> 2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. | >> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal >> Checkpoint Worker >> 2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. | >> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal >> Checkpoint Worker >> 2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done. >> | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler >> >> >> > > -- > Christian Schneider > http://www.liquid-reality.de > > Open Source Architect > http://www.talend.com > >