Re: tlog replay

Rallavagu Thu, 08 Oct 2015 17:38:59 -0700

Erick,

Actually, configured autocommit to 15 seconds and openSearcher is set tofalse. Neither 2 nor 3 happened. However, softCommit is set to 10 min.


<autoCommit>
       <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
       <openSearcher>false</openSearcher>
     </autoCommit>

Working on upgrading to 5.3 which will take a bit of time and trying toget this under control until that time.


On 10/8/15 5:28 PM, Erick Erickson wrote:

right, so the scenario is
1> somehow you didn't do a hard commit (openSearcher=true or false
doesn't matter) for a really long time while indexing.
2> Solr abnormally terminated.
3> When Solr started back up it replayed the entire log.

How <1> happened is the mystery though. With a hard commit
(autocommit) interval of 15 seconds that's weird.

The message indicates something like that happened. In very recent
Solr versions, the log will have
progress messages printed that'll help see this is happening.

Best,
Erick

On Thu, Oct 8, 2015 at 12:23 PM, Rallavagu <rallav...@gmail.com> wrote:

As a follow up.

Eventually the tlog file is disappeared (could not track the time it took to
clear out completely). However, following messages were noticed in
follower's log.

5120638 [recoveryExecutor-14-thread-2] WARN org.apache.solr.update.UpdateLog
– Starting log replay tlog

On 10/7/15 8:29 PM, Erick Erickson wrote:


The only way I can account for such a large file off the top of my
head is if, for some reason,
the Solr on the node somehow was failing to index documents and kept
adding them to the
log for a loooonnngggg time. But how that would happen without the
node being in recovery
mode I'm not sure. I mean the Solr instance would have to be healthy
otherwise but just not
able to index docs which makes no sense.

The usual question here is whether there were any messages in the solr
log file indicating
problems while this built up.

tlogs will build up to very large sizes if there are very long hard
commit intervals, but I don't
see how that interval would be different on the leader and follower.

So color me puzzled.

Best,
Erick

On Wed, Oct 7, 2015 at 8:09 PM, Rallavagu <rallav...@gmail.com> wrote:


Thanks Erick.

Eventually, followers caught up but the 14G tlog file still persists and
they are healthy. Is there anything to look for? Will monitor and see how
long will it take before it disappears.

Evaluating move to Solr 5.3.

On 10/7/15 7:51 PM, Erick Erickson wrote:



Uhm, that's very weird. Updates are not applied from the tlog. Rather
the
raw doc is forwarded to the replica which both indexes the doc and
writes it to the local tlog. So having a 14G tlog on a follower but a
small
tlog on the leader is definitely strange, especially if it persists over
time.

I assume the follower is healthy? And does this very large tlog
disappear
after a while? I'd expect it to be aged out after a few commits of > 100
docs.

All that said, there have been a LOT of improvements since 4.6, so it
might
be something that's been addressed in the intervening time.

Best,
Erick



On Wed, Oct 7, 2015 at 7:39 PM, Rallavagu <rallav...@gmail.com> wrote:



Solr 4.6.1, single shard, 4 node cloud, 3 node zk

Like to understand the behavior better when large number of updates
happen
on leader and it generates huge tlog (14G sometimes in my case) on
other
nodes. At the same time leader's tlog is few KB. So, what is the rate
at
which the changes from transaction log are applied at nodes? The
autocommit
interval is set to 15 seconds after going through


https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Thanks

Re: tlog replay

Reply via email to