Re: tlog replay

2015-10-08 Thread Rallavagu

As a follow up.

Eventually the tlog file disappeared (I could not track how long it 
took to clear out completely). However, the following message was 
noticed in the follower's log.


5120638 [recoveryExecutor-14-thread-2] WARN org.apache.solr.update.UpdateLog  – Starting log replay tlog


On 10/7/15 8:29 PM, Erick Erickson wrote:

The only way I can account for such a large file off the top of my
head is if, for some reason, the Solr on the node somehow was failing
to index documents and kept adding them to the log for a long time.
But how that would happen without the node being in recovery mode I'm
not sure. I mean the Solr instance would have to be healthy otherwise
but just not able to index docs, which makes no sense.

The usual question here is whether there were any messages in the solr
log file indicating problems while this built up.

tlogs will build up to very large sizes if there are very long hard
commit intervals, but I don't see how that interval would be different
on the leader and follower.

So color me puzzled.

Best,
Erick



Re: tlog replay

2015-10-08 Thread Rallavagu

Erick,

Actually, autocommit is configured at 15 seconds with openSearcher set to 
false. Neither 2 nor 3 happened. However, softCommit is set to 10 min.



 <autoCommit>
   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>

Working on upgrading to 5.3 which will take a bit of time and trying to 
get this under control until that time.


On 10/8/15 5:28 PM, Erick Erickson wrote:

right, so the scenario is
1> somehow you didn't do a hard commit (openSearcher=true or false
doesn't matter) for a really long time while indexing.
2> Solr abnormally terminated.
3> When Solr started back up it replayed the entire log.

How <1> happened is the mystery though. With a hard commit
(autocommit) interval of 15 seconds that's weird.

The message indicates something like that happened. In very recent
Solr versions, the log will have progress messages printed that'll
help you see this is happening.

Best,
Erick



Re: tlog replay

2015-10-08 Thread Erick Erickson
right, so the scenario is
1> somehow you didn't do a hard commit (openSearcher=true or false
doesn't matter) for a really long time while indexing.
2> Solr abnormally terminated.
3> When Solr started back up it replayed the entire log.

How <1> happened is the mystery though. With a hard commit
(autocommit) interval of 15 seconds that's weird.

The message indicates something like that happened. In very recent
Solr versions, the log will have progress messages printed that'll
help you see this is happening.
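
To make the three-step scenario above concrete, here is a toy model in Java. It is only a sketch under simplifying assumptions, not Solr's implementation: ToyTlogNode and all of its methods are hypothetical names, and the model drops the tlog immediately on a hard commit, whereas real Solr keeps some older tlogs around for a while. The final commit at the end of replay matches the commit with flags=2 reported elsewhere in this archive.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical, none of this is Solr code.
final class ToyTlogNode {
    private final List<String> index = new ArrayList<>();   // durably committed docs
    private final List<String> pending = new ArrayList<>(); // indexed in memory, not yet hard-committed
    private final List<String> tlog = new ArrayList<>();    // updates logged since the last hard commit

    /** An update is both indexed (in memory) and appended to the tlog. */
    void add(String doc) {
        pending.add(doc);
        tlog.add(doc);
    }

    /** A hard commit makes pending docs durable, so the tlog can start over (simplified). */
    void hardCommit() {
        index.addAll(pending);
        pending.clear();
        tlog.clear();
    }

    /** Abnormal termination loses in-memory state; on restart the whole tlog is replayed. */
    int crashAndRestart() {
        pending.clear();            // uncommitted in-memory work is gone
        int replayed = tlog.size();
        pending.addAll(tlog);       // replay every logged update
        hardCommit();               // replay finishes with a commit
        return replayed;
    }

    int tlogSize() { return tlog.size(); }
    int indexSize() { return index.size(); }

    public static void main(String[] args) {
        ToyTlogNode node = new ToyTlogNode();
        node.add("doc-0");
        node.hardCommit();
        System.out.println("tlog after commit: " + node.tlogSize());          // 0

        for (int i = 1; i <= 1000; i++) node.add("doc-" + i);                 // step 1: no hard commit
        System.out.println("tlog without commits: " + node.tlogSize());       // 1000
        System.out.println("replayed on restart: " + node.crashAndRestart()); // steps 2 and 3: 1000
    }
}
```

With frequent hard commits the log stays small and a restart has little to replay; the long replay only shows up when step 1 has gone on for a long time.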

Best,
Erick

On Thu, Oct 8, 2015 at 12:23 PM, Rallavagu  wrote:
> As a follow up.
>
> Eventually the tlog file disappeared (I could not track how long it took to
> clear out completely). However, the following message was noticed in the
> follower's log.
>
> 5120638 [recoveryExecutor-14-thread-2] WARN org.apache.solr.update.UpdateLog – Starting log replay tlog


Re: tlog replay

2015-10-07 Thread Erick Erickson
The only way I can account for such a large file off the top of my
head is if, for some reason, the Solr on the node somehow was failing
to index documents and kept adding them to the log for a long time.
But how that would happen without the node being in recovery mode I'm
not sure. I mean the Solr instance would have to be healthy otherwise
but just not able to index docs, which makes no sense.

The usual question here is whether there were any messages in the solr
log file indicating problems while this built up.

tlogs will build up to very large sizes if there are very long hard
commit intervals, but I don't see how that interval would be different
on the leader and follower.

So color me puzzled.

Best,
Erick

On Wed, Oct 7, 2015 at 8:09 PM, Rallavagu  wrote:
> Thanks Erick.
>
> Eventually, the followers caught up and are healthy, but the 14G tlog file
> still persists. Is there anything to look for? I will monitor and see how
> long it takes before it disappears.
>
> Evaluating move to Solr 5.3.


Re: tlog replay

2015-10-07 Thread Erick Erickson
Uhm, that's very weird. Updates are not applied from the tlog. Rather the
raw doc is forwarded to the replica which both indexes the doc and
writes it to the local tlog. So having a 14G tlog on a follower but a small
tlog on the leader is definitely strange, especially if it persists over time.
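
The update path described above can be pictured with a small toy model in Java. This is a sketch under stated assumptions, not how Solr is implemented: ToyReplica, indexLocally and update are hypothetical names, and networking, versioning and commits are left out. It only shows why the leader's and follower's tlogs would normally grow at the same pace: each node indexes the raw doc and writes it to its own tlog.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, no relation to Solr's actual classes.
final class ToyReplica {
    final List<String> index = new ArrayList<>();
    final List<String> tlog = new ArrayList<>();
    final List<ToyReplica> followers = new ArrayList<>();

    /** Every node indexes the raw doc AND appends it to its own local tlog. */
    void indexLocally(String rawDoc) {
        index.add(rawDoc);
        tlog.add(rawDoc);
    }

    /** Leader entry point: handle the doc locally, then forward the raw doc to each follower. */
    void update(String rawDoc) {
        indexLocally(rawDoc);
        for (ToyReplica follower : followers) {
            follower.indexLocally(rawDoc);
        }
    }

    public static void main(String[] args) {
        ToyReplica leader = new ToyReplica();
        ToyReplica follower = new ToyReplica();
        leader.followers.add(follower);

        for (int i = 0; i < 5; i++) leader.update("doc-" + i);

        // Both tlogs hold the same five entries; nothing is applied "from" the leader's tlog.
        System.out.println(leader.tlog.size() + " " + follower.tlog.size());
    }
}
```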

I assume the follower is healthy? And does this very large tlog disappear
after a while? I'd expect it to be aged out after a few commits of > 100 docs.

All that said, there have been a LOT of improvements since 4.6, so it might
be something that's been addressed in the intervening time.

Best,
Erick



On Wed, Oct 7, 2015 at 7:39 PM, Rallavagu  wrote:
> Solr 4.6.1, single shard, 4 node cloud, 3 node zk
>
> I'd like to understand the behavior better when a large number of updates
> happen on the leader and it generates a huge tlog (sometimes 14G in my case)
> on the other nodes. At the same time the leader's tlog is a few KB. So, at
> what rate are the changes from the transaction log applied on the nodes? The
> autocommit interval is set to 15 seconds after going through
> https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Thanks


Re: tlog replay

2015-10-07 Thread Rallavagu

Thanks Erick.

Eventually, the followers caught up and are healthy, but the 14G tlog 
file still persists. Is there anything to look for? I will monitor and 
see how long it takes before it disappears.


Evaluating move to Solr 5.3.

On 10/7/15 7:51 PM, Erick Erickson wrote:

Uhm, that's very weird. Updates are not applied from the tlog. Rather the
raw doc is forwarded to the replica which both indexes the doc and
writes it to the local tlog. So having a 14G tlog on a follower but a small
tlog on the leader is definitely strange, especially if it persists over time.

I assume the follower is healthy? And does this very large tlog disappear
after a while? I'd expect it to be aged out after a few commits of > 100 docs.

All that said, there have been a LOT of improvements since 4.6, so it might
be something that's been addressed in the intervening time.

Best,
Erick





Re: Tlog replay

2015-07-08 Thread Alessandro Benedetti
Hi Summer,

If you take a look at the CommitUpdateCommand class, you will notice
there is no flags field in it.

// this is the toString for example

@Override
public String toString() {
  return super.toString() + ",optimize=" + optimize
      + ",openSearcher=" + openSearcher
      + ",waitSearcher=" + waitSearcher
      + ",expungeDeletes=" + expungeDeletes
      + ",softCommit=" + softCommit
      + ",prepareCommit=" + prepareCommit
      + '}';
}


If you then look at the parent UpdateCommand class, you find the flags:


public static int BUFFERING = 0x0001;          // update command is being buffered.
public static int REPLAY = 0x0002;             // update command is from replaying a log.
public static int PEER_SYNC = 0x0004;          // update command is a missing update being provided by a peer.
public static int IGNORE_AUTOCOMMIT = 0x0008;  // this update should not count toward triggering of autocommits.
public static int CLEAR_CACHES = 0x0010;       // clear caches associated with the update log. used when applying reordered DBQ updates when doing an add.

So flags=2 is actually saying that the update command is from
replaying a log (which is what you would expect).
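
Since the flags value is a bitmask, several flags can be set at once. A minimal sketch of decoding such a value follows; the constant values are the ones quoted above, while FlagDecoder and describe are hypothetical helper names made up for this illustration and are not part of Solr.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper only: the constants copy the values quoted from UpdateCommand,
// but FlagDecoder and describe() are hypothetical and do not exist in Solr.
final class FlagDecoder {
    static final int BUFFERING         = 0x0001;
    static final int REPLAY            = 0x0002;
    static final int PEER_SYNC         = 0x0004;
    static final int IGNORE_AUTOCOMMIT = 0x0008;
    static final int CLEAR_CACHES      = 0x0010;

    /** Returns the names of all bits set in flags, in ascending bit order. */
    static List<String> describe(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & BUFFERING) != 0)         names.add("BUFFERING");
        if ((flags & REPLAY) != 0)            names.add("REPLAY");
        if ((flags & PEER_SYNC) != 0)         names.add("PEER_SYNC");
        if ((flags & IGNORE_AUTOCOMMIT) != 0) names.add("IGNORE_AUTOCOMMIT");
        if ((flags & CLEAR_CACHES) != 0)      names.add("CLEAR_CACHES");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(describe(2));          // prints [REPLAY]
        System.out.println(describe(0x02 | 0x08)); // prints [REPLAY, IGNORE_AUTOCOMMIT]
    }
}
```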


Cheers


2015-07-08 3:01 GMT+01:00 Summer Shire shiresum...@gmail.com:


 Hi,

 When I restart my solr core the log replay starts and just before it
 finishes I see the following commit

 start
 commit{flags=2,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

 what does the “flags=2” param do ?

 when I try to send that param to the updateHandler manually solr does not
 like it

 curl http://localhost:6600/solr/main/update -H "Content-Type: text/xml" \
 --data-binary '<commit openSearcher="true" flags="2" waitSearcher="false"/>'

 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">400</int><int name="QTime">0</int></lst>
 <lst name="error"><str name="msg">Unknown commit parameter 'flags'</str><int name="code">400</int></lst>
 </response>

 thanks,
 Summer




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Tlog replay

2015-07-08 Thread Summer Shire
Thanks Alessandro !

Any idea on why I couldn't curl the solr core and pass the flag param ?




Re: Tlog replay

2015-07-08 Thread Yonik Seeley
On Wed, Jul 8, 2015 at 12:31 PM, Summer Shire shiresum...@gmail.com wrote:
 Thanks Alessandro !

 Any idea on why I couldn't curl the solr core and pass the flag param ?

These flags are for internal use only.  Solr sets them, the client doesn't.
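
For comparison, a client-side commit that leaves the internal flag out might be built as in the sketch below. This is a hedged illustration, not an official client: CommitRequestSketch, commitXml and build are hypothetical names, the URL is the one from the earlier message in this thread, and the request is only constructed, not sent. It uses the JDK's java.net.http API (Java 11+) and only the openSearcher and waitSearcher attributes that appeared in the original curl attempt.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative sketch: class and method names are hypothetical, not part of Solr or SolrJ.
final class CommitRequestSketch {

    /** Builds an XML commit message with client-settable attributes only (no flags). */
    static String commitXml(boolean openSearcher, boolean waitSearcher) {
        return "<commit openSearcher=\"" + openSearcher
             + "\" waitSearcher=\"" + waitSearcher + "\"/>";
    }

    /** Builds (but does not send) a POST request carrying the commit message. */
    static HttpRequest build(String updateUrl, String body) {
        return HttpRequest.newBuilder(URI.create(updateUrl))
                .header("Content-Type", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = commitXml(true, false);
        HttpRequest request = build("http://localhost:6600/solr/main/update", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```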

-Yonik