org.apache.accumulo.gc.replication.CloseWriteAheadLogReferences is what I needed to find.
Thank you

On Wed, Jan 11, 2017 at 8:02 PM, Josh Elser <josh.el...@gmail.com> wrote:

> Did you look at the accumulo-gc log to actually correlate how often the
> class I sent is being executed?
>
> Noe Detore wrote:
>
>> To be fair, after writing the post I grepped the logs and found my WALs
>> rolling over on size before the max.age threshold was hit. That is
>> the reason I did not see any improvement in latency from reducing
>> max.age.
>>
>> There is still an x factor, which I need to figure out, between when a
>> WAL is no longer written to by the tserver and when it actually gets
>> replicated. For example, my WALs appear to be done being written to
>> (new WAL created on the tserver) in 3m, but replication is taking about
>> 12 to 15 min to complete. Even though the WAL is not being written to
>> after 3m, I am not seeing it ready for replication (closed: true) until
>> after 13m.
>>
>>
>> On Wed, Jan 11, 2017 at 5:44 PM, Josh Elser <josh.el...@gmail.com
>> <mailto:josh.el...@gmail.com>> wrote:
>>
>>     See org.apache.accumulo.gc.replication.CloseWriteAheadLogReferences
>>     for where WALs are currently marked as "closed".
>>
>>     I don't recall the details, but I think there was some issue with
>>     trying to close them in TabletServerLogger.
>>
>>     Yes to your last question: if it were done in TabletServerLogger, it
>>     would be closed more quickly than done by the GC. The issue is
>>     whether or not it's actually safe to mark them as closed there. I
>>     just don't remember the internal WAL lifecycle well enough.
>>
>>
>>     Noe Detore wrote:
>>
>>         Hello,
>>
>>         I am trying to influence replication latency with
>>         tserver.walog.max.age, but I am noticing no difference when
>>         setting the value low. Looking in the code of
>>         org.apache.accumulo.tserver.log.TabletServerLogger:
>>
>>         protected void closeForReplication(Collection<CommitSession> sessions) {
>>           // TODO We can close the WAL here for replication purposes
>>         }
>>
>>         This TODO is called by:
>>
>>             testLockAndRun(logSetLock, new TestCallWithWriteLock() {
>>               @Override
>>               boolean test() {
>>                 return (logSizeEstimate.get() > maxSize) ||
>>                     ((System.currentTimeMillis() - createTime) > maxAge);
>>               }
>>
>>               @Override
>>               void withWriteLock() throws IOException {
>>                 close();
>>                 closeForReplication(sessions);
>>               }
>>             });
>>             return seq;
>>           }
>>
>>         I am still trying to understand what is happening here, but
>>         could this TODO be the reason replication status records are
>>         not being updated with 'closed: true' sooner?
>>
>>         Thank you
>>         Noe
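
For anyone reading this thread later, here is a small standalone sketch of the roll-over condition from the quoted test() method. It is not Accumulo code: the class name WalRollSketch, the helpers shouldRoll and rollReason, and all the numbers are hypothetical and only illustrate why lowering max.age has no visible effect when the size limit is already reached first.

```java
// Standalone illustration only; not part of Accumulo.
// Names and values below are hypothetical.
class WalRollSketch {

    // Same shape as the quoted test(): roll when either limit is exceeded.
    static boolean shouldRoll(long sizeEstimate, long maxSize,
                              long createTime, long now, long maxAge) {
        return (sizeEstimate > maxSize) || ((now - createTime) > maxAge);
    }

    // Reports which limit caused the roll, checking size first.
    static String rollReason(long sizeEstimate, long maxSize,
                             long createTime, long now, long maxAge) {
        if (sizeEstimate > maxSize) {
            return "size";
        }
        if ((now - createTime) > maxAge) {
            return "age";
        }
        return "none";
    }

    public static void main(String[] args) {
        long maxSize = 1_000L;          // hypothetical size limit
        long created = 0L;              // WAL creation time in ms
        long now = 3 * 60 * 1000L;      // 3 minutes after creation

        // The WAL is already over the size limit after 3 minutes, so it
        // rolls on size whether max.age is 10 minutes or 5 minutes.
        System.out.println(rollReason(1_500L, maxSize, created, now, 10 * 60 * 1000L));
        System.out.println(rollReason(1_500L, maxSize, created, now, 5 * 60 * 1000L));

        // Only when the WAL stays under the size limit does max.age matter.
        System.out.println(rollReason(500L, maxSize, created, now, 10 * 60 * 1000L));
        System.out.println(rollReason(500L, maxSize, created, now, 2 * 60 * 1000L));
    }
}
```

In this sketch the first two calls report "size" regardless of the age setting, which matches what the logs showed: the WALs were rolling on size before the max.age threshold was hit.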