Re: Replication Latency

Noe Detore Thu, 12 Jan 2017 03:14:15 -0800

org.apache.accumulo.gc.replication.CloseWriteAheadLogReferences is what I
needed to find.

Thank you.

On Wed, Jan 11, 2017 at 8:02 PM, Josh Elser <josh.el...@gmail.com> wrote:

> Did you look at the accumulo-gc log to actually correlate how often the
> class I sent is being executed?
>
> Noe Detore wrote:
>
>> To be fair, after writing the post I grepped the logs and found my WALs
>> rolling over on size before the max.age threshold was hit. That is
>> the reason I did not see any improvement in latency from reducing
>> max.age.
>>
>> There is still an unexplained delay between when the tserver stops
>> writing to a WAL and when that WAL actually gets replicated, which I
>> need to figure out. For example, my WALs appear to be done being written
>> to (new WAL created on the tserver) after 3m, but replication takes about
>> 12 to 15 min to complete. Even though the WAL is not written to after
>> 3m, I do not see it ready for replication (closed: true) until after 13m.
>>
>>
>> On Wed, Jan 11, 2017 at 5:44 PM, Josh Elser <josh.el...@gmail.com> wrote:
>>
>>     See org.apache.accumulo.gc.replication.CloseWriteAheadLogReferences
>>     for where WALs are currently marked as "closed".
>>
>>     I don't recall the details, but I think there was some issue with
>>     trying to close them in TabletServerLogger.
>>
>>     Yes to your last question: if it were done in TabletServerLogger, the
>>     WAL would be closed more quickly than when it is done by the GC. The
>>     issue is whether or not it's actually safe to mark them as closed
>>     there. I just don't remember the internal WAL lifecycle well enough.
>>
>>
>>     Noe Detore wrote:
>>
>>         Hello,
>>
>>         I am trying to influence replication latency with
>>         tserver.walog.max.age, but I notice no difference when setting
>>         the value low. Looking in the code of
>>         org.apache.accumulo.tserver.log.TabletServerLogger:
>>
>>         protected void closeForReplication(Collection<CommitSession> sessions) {
>>           // TODO We can close the WAL here for replication purposes
>>         }
>>
>>         This TODO method is called by:
>>         testLockAndRun(logSetLock, new TestCallWithWriteLock() {
>>           @Override
>>           boolean test() {
>>             return (logSizeEstimate.get() > maxSize)
>>                 || ((System.currentTimeMillis() - createTime) > maxAge);
>>           }
>>
>>           @Override
>>           void withWriteLock() throws IOException {
>>             close();
>>             closeForReplication(sessions);
>>           }
>>         });
>>         return seq;
>>       }
>>
>>         I am still trying to understand what is happening here, but could
>>         this TODO be the reason replication status records are not being
>>         updated with 'closed: true' sooner?
>>
>>         Thank you
>>         Noe
>>
>>
>>
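The roll-over test quoted in the thread can be sketched in isolation to show why lowering tserver.walog.max.age made no difference while the WALs were already rolling on size. This is a minimal standalone sketch, not Accumulo code: the class WalRollSketch, the helper rollReason, and every numeric value are hypothetical. Only the "size exceeded OR age exceeded" condition mirrors the quoted TabletServerLogger snippet.

```java
public class WalRollSketch {

    /** Mirrors the quoted condition: roll when the size estimate or the age exceeds its limit. */
    public static boolean shouldRoll(long sizeEstimate, long maxSize,
                                     long ageMillis, long maxAgeMillis) {
        return sizeEstimate > maxSize || ageMillis > maxAgeMillis;
    }

    /** Hypothetical helper: reports which limit triggers the roll, checking size first. */
    public static String rollReason(long sizeEstimate, long maxSize,
                                    long ageMillis, long maxAgeMillis) {
        if (sizeEstimate > maxSize) {
            return "size";
        }
        if (ageMillis > maxAgeMillis) {
            return "age";
        }
        return "none";
    }

    public static void main(String[] args) {
        long maxSize = 1_000L;           // illustrative size limit, in bytes
        long threeMinutes = 3 * 60_000L; // illustrative WAL age, in milliseconds

        // WAL already over the size limit at 3 minutes, with a 10 minute age limit.
        System.out.println(rollReason(1_500L, maxSize, threeMinutes, 10 * 60_000L));

        // Same WAL with the age limit lowered to 5 minutes: the size check still decides.
        System.out.println(rollReason(1_500L, maxSize, threeMinutes, 5 * 60_000L));

        // A small WAL that has outlived the age limit rolls on age instead.
        System.out.println(rollReason(200L, maxSize, 11 * 60_000L, 10 * 60_000L));
    }
}
```

In this sketch the age limit only matters for WALs that stay under the size limit, which matches the observation that the WALs were rolling on size before max.age was reached. It says nothing about the later delay before a WAL is marked closed, which the thread attributes to CloseWriteAheadLogReferences running in the GC.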
