RE: CDCR - how to deal with the transaction log files

2017-07-28 Thread Xie, Sean
You don't need to start CDCR on the target cluster. The other steps are exactly 
what I did. After disabling the buffer on both target and source, the tlog files 
are purged as described.


-- Thank you
Sean

From: Patrick Hoeffel <patrick.hoef...@polarisalpha.com>
Date: Friday, Jul 28, 2017, 4:01 PM
To: solr-user@lucene.apache.org
Cc: jmy...@wayfair.com
Subject: [EXTERNAL] RE: CDCR - how to deal with the transaction log files

Amrit,

Problem solved! My biggest mistake was in my SOURCE-side configuration. The 
zkHost field needed the entire zkHost string, including the CHROOT indicator. I 
suppose that should have been obvious to me, but the examples only showed the 
IP Address of the target ZK, and I made a poor assumption.

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="replica">
      <str name="zkHost">10.161.0.7:2181,10.161.0.6:2181,10.161.0.5:2181/chroot/solr</str>
      <str name="source">ks_v1</str>
      <str name="target">ks_v1</str>
    </lst>
  </requestHandler>

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="replica">
      <str name="zkHost">10.161.0.7:2181</str>   <=== Problem was here.
      <str name="source">ks_v1</str>
      <str name="target">ks_v1</str>
    </lst>
  </requestHandler>


After that, I just made sure I did this:
1. Stop all Solr nodes at both SOURCE and TARGET.
2. $ rm -rf $SOLR_HOME/server/solr/collection_name/data/tlog/*.*
3. On the TARGET:
a. $ collection/cdcr?action=DISABLEBUFFER
b. $ collection/cdcr?action=START

4. On the SOURCE:
a. $ collection/cdcr?action=DISABLEBUFFER
b. $ collection/cdcr?action=START

At this point any existing data in the SOURCE collection started flowing into 
the TARGET collection, and it has remained congruent ever since.

Thanks,



Patrick Hoeffel

Senior Software Engineer
(Direct)  719-452-7371
(Mobile) 719-210-3706
patrick.hoef...@polarisalpha.com
PolarisAlpha.com


-Original Message-
From: Amrit Sarkar [mailto:sarkaramr...@gmail.com]
Sent: Friday, July 21, 2017 7:21 AM
To: solr-user@lucene.apache.org
Cc: jmy...@wayfair.com
Subject: Re: CDCR - how to deal with the transaction log files

Patrick,

Yes! You created the default UpdateLog, which got written to disk, and then you 
changed it to CdcrUpdateLog in the configs. I see no reason it would have created 
a proper COLLECTIONCHECKPOINT on the target tlog.

One thing you can try before creating / starting from scratch is restarting the 
source cluster nodes; the shard leaders will try to create the same 
COLLECTIONCHECKPOINT, which may or may not succeed.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Fri, Jul 21, 2017 at 11:09 AM, Patrick Hoeffel < 
patrick.hoef...@polarisalpha.com> wrote:

> I'm working on my first setup of CDCR, and I'm seeing the same "The
> log reader for target collection {collection name} is not initialised"
> as you saw.
>
> It looks like you're creating collections on a regular basis, but for
> me, I create it one time and never again. I've been creating the
> collection first from defaults and then applying the CDCR-aware
> solrconfig changes afterward. It sounds like maybe I need to create
> the configset in ZK first, then create the collections, first on the
> Target and then on the Source, and I should be good?
>
> Thanks,
>
> Patrick Hoeffel
> Senior Software Engineer
> (Direct)  719-452-7371
> (Mobile) 719-210-3706
> patrick.hoef...@polarisalpha.com
> PolarisAlpha.com
>
>
> -Original Message-
> From: jmyatt [mailto:jmy...@wayfair.com]
> Sent: Wednesday, July 12, 2017 4:49 PM
> To: solr-user@lucene.apache.org
> Subject: Re: CDCR - how to deal with the transaction log files
>
> glad to hear you found your solution!  I have been combing over this
> post and others on this discussion board many times and have tried so
> many tweaks to configuration, order of steps, etc, all with absolutely
> no success in getting the Source cluster tlogs to delete.  So
> incredibly frustrating.  If anyone has other pearls of wisdom I'd love some 
> advice.
> Quick hits on what I've tried:
>
> - solrconfig exactly like Sean's (target and source respectively)
> expect no autoSoftCommit
> - I am also calling cdcr?action=DISABLEBUFFER (on source as well as on
> target) explicitly before starting since the config setting of
> defaultState=disabled doesn't seem to work
> - when I create the collection on source first, I get the warning "The
> log reader for target collection {collection name} is not
> initialised".  When I reverse the order (create the collection on
> target first), no such warning
> - tlogs replicate as expected, hard commits on both target and source
> cause tlogs to rollover, etc - all of that works as expected
> - action=QUEUES on source reflects the queueSize accurately.  Also
> *always* shows updateLogSynchronizer st

RE: CDCR - how to deal with the transaction log files

2017-07-28 Thread Patrick Hoeffel
Amrit,

Problem solved! My biggest mistake was in my SOURCE-side configuration. The 
zkHost field needed the entire zkHost string, including the CHROOT indicator. I 
suppose that should have been obvious to me, but the examples only showed the 
IP Address of the target ZK, and I made a poor assumption.

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="replica">
      <str name="zkHost">10.161.0.7:2181,10.161.0.6:2181,10.161.0.5:2181/chroot/solr</str>
      <str name="source">ks_v1</str>
      <str name="target">ks_v1</str>
    </lst>
  </requestHandler>

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="replica">
      <str name="zkHost">10.161.0.7:2181</str>   <=== Problem was here.
      <str name="source">ks_v1</str>
      <str name="target">ks_v1</str>
    </lst>
  </requestHandler>


After that, I just made sure I did this:
1. Stop all Solr nodes at both SOURCE and TARGET.
2. $ rm -rf $SOLR_HOME/server/solr/collection_name/data/tlog/*.*
3. On the TARGET:
a. $ collection/cdcr?action=DISABLEBUFFER
b. $ collection/cdcr?action=START

4. On the SOURCE:
a. $ collection/cdcr?action=DISABLEBUFFER
b. $ collection/cdcr?action=START

At this point any existing data in the SOURCE collection started flowing into 
the TARGET collection, and it has remained congruent ever since.
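The four CDCR calls in steps 3 and 4 can be sketched as a small script. The host names, port, and use of plain HTTP GET are illustrative assumptions; only the action names and the target-before-source ordering come from the steps above.

```python
# Sketch of steps 3-4: disable the CDCR buffer, then start CDCR, on the
# TARGET cluster first and the SOURCE cluster second. Host names, the port,
# and the collection name below are placeholders, not from the original post.
from urllib.request import urlopen

def cdcr_url(base, collection, action):
    # e.g. http://host:8983/solr/<collection>/cdcr?action=START
    return f"{base}/solr/{collection}/cdcr?action={action}"

def init_cdcr(target_base, source_base, collection, fetch=urlopen):
    # Order matters: prepare the target before the source starts shipping.
    for base in (target_base, source_base):
        for action in ("DISABLEBUFFER", "START"):
            fetch(cdcr_url(base, collection, action))

# Dry run: record the URLs instead of issuing real HTTP requests.
calls = []
init_cdcr("http://target:8983", "http://source:8983", "ks_v1",
          fetch=calls.append)
print(calls[0])  # -> http://target:8983/solr/ks_v1/cdcr?action=DISABLEBUFFER
```

Swapping `calls.append` for the default `urlopen` would issue the real requests against a live cluster.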

Thanks,



Patrick Hoeffel

Senior Software Engineer
(Direct)  719-452-7371
(Mobile) 719-210-3706
patrick.hoef...@polarisalpha.com
PolarisAlpha.com 


-Original Message-
From: Amrit Sarkar [mailto:sarkaramr...@gmail.com] 
Sent: Friday, July 21, 2017 7:21 AM
To: solr-user@lucene.apache.org
Cc: jmy...@wayfair.com
Subject: Re: CDCR - how to deal with the transaction log files

Patrick,

Yes! You created the default UpdateLog, which got written to disk, and then you 
changed it to CdcrUpdateLog in the configs. I see no reason it would have created 
a proper COLLECTIONCHECKPOINT on the target tlog.

One thing you can try before creating / starting from scratch is restarting the 
source cluster nodes; the shard leaders will try to create the same 
COLLECTIONCHECKPOINT, which may or may not succeed.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Fri, Jul 21, 2017 at 11:09 AM, Patrick Hoeffel < 
patrick.hoef...@polarisalpha.com> wrote:

> I'm working on my first setup of CDCR, and I'm seeing the same "The 
> log reader for target collection {collection name} is not initialised" 
> as you saw.
>
> It looks like you're creating collections on a regular basis, but for 
> me, I create it one time and never again. I've been creating the 
> collection first from defaults and then applying the CDCR-aware 
> solrconfig changes afterward. It sounds like maybe I need to create 
> the configset in ZK first, then create the collections, first on the 
> Target and then on the Source, and I should be good?
>
> Thanks,
>
> Patrick Hoeffel
> Senior Software Engineer
> (Direct)  719-452-7371
> (Mobile) 719-210-3706
> patrick.hoef...@polarisalpha.com
> PolarisAlpha.com
>
>
> -Original Message-
> From: jmyatt [mailto:jmy...@wayfair.com]
> Sent: Wednesday, July 12, 2017 4:49 PM
> To: solr-user@lucene.apache.org
> Subject: Re: CDCR - how to deal with the transaction log files
>
> glad to hear you found your solution!  I have been combing over this 
> post and others on this discussion board many times and have tried so 
> many tweaks to configuration, order of steps, etc, all with absolutely 
> no success in getting the Source cluster tlogs to delete.  So 
> incredibly frustrating.  If anyone has other pearls of wisdom I'd love some 
> advice.
> Quick hits on what I've tried:
>
> - solrconfig exactly like Sean's (target and source respectively) 
> expect no autoSoftCommit
> - I am also calling cdcr?action=DISABLEBUFFER (on source as well as on
> target) explicitly before starting since the config setting of 
> defaultState=disabled doesn't seem to work
> - when I create the collection on source first, I get the warning "The 
> log reader for target collection {collection name} is not 
> initialised".  When I reverse the order (create the collection on 
> target first), no such warning
> - tlogs replicate as expected, hard commits on both target and source 
> cause tlogs to rollover, etc - all of that works as expected
> - action=QUEUES on source reflects the queueSize accurately.  Also
> *always* shows updateLogSynchronizer state as "stopped"
> - action=LASTPROCESSEDVERSION on both source and target always seems 
> correct (I don't see the -1 that Sean mentioned).
> - I'm creating new collections every time and running full data 
> imports that take 5-10 minutes. Again, all data replication, log 
> rollover, and autocommit activity seems to work as expected, and logs 
> on target are deleted.  It's just those pesky source tlogs I can't get to 
> delete.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/CDCR-how-to-deal-with-the-transaction-log-
> files-tp4345062p4345715.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: CDCR - how to deal with the transaction log files

2017-07-21 Thread Amrit Sarkar
Patrick,

Yes! You created the default UpdateLog, which got written to disk, and then you
changed it to CdcrUpdateLog in the configs. I see no reason it would have created
a proper COLLECTIONCHECKPOINT on the target tlog.

One thing you can try before creating / starting from scratch is restarting the
source cluster nodes; the shard leaders will try to create the same
COLLECTIONCHECKPOINT, which may or may not succeed.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Fri, Jul 21, 2017 at 11:09 AM, Patrick Hoeffel <
patrick.hoef...@polarisalpha.com> wrote:

> I'm working on my first setup of CDCR, and I'm seeing the same "The log
> reader for target collection {collection name} is not initialised" as you
> saw.
>
> It looks like you're creating collections on a regular basis, but for me,
> I create it one time and never again. I've been creating the collection
> first from defaults and then applying the CDCR-aware solrconfig changes
> afterward. It sounds like maybe I need to create the configset in ZK first,
> then create the collections, first on the Target and then on the Source,
> and I should be good?
>
> Thanks,
>
> Patrick Hoeffel
> Senior Software Engineer
> (Direct)  719-452-7371
> (Mobile) 719-210-3706
> patrick.hoef...@polarisalpha.com
> PolarisAlpha.com
>
>
> -Original Message-
> From: jmyatt [mailto:jmy...@wayfair.com]
> Sent: Wednesday, July 12, 2017 4:49 PM
> To: solr-user@lucene.apache.org
> Subject: Re: CDCR - how to deal with the transaction log files
>
> glad to hear you found your solution!  I have been combing over this post
> and others on this discussion board many times and have tried so many
> tweaks to configuration, order of steps, etc, all with absolutely no
> success in getting the Source cluster tlogs to delete.  So incredibly
> frustrating.  If anyone has other pearls of wisdom I'd love some advice.
> Quick hits on what I've tried:
>
> - solrconfig exactly like Sean's (target and source respectively) expect
> no autoSoftCommit
> - I am also calling cdcr?action=DISABLEBUFFER (on source as well as on
> target) explicitly before starting since the config setting of
> defaultState=disabled doesn't seem to work
> - when I create the collection on source first, I get the warning "The log
> reader for target collection {collection name} is not initialised".  When I
> reverse the order (create the collection on target first), no such warning
> - tlogs replicate as expected, hard commits on both target and source
> cause tlogs to rollover, etc - all of that works as expected
> - action=QUEUES on source reflects the queueSize accurately.  Also
> *always* shows updateLogSynchronizer state as "stopped"
> - action=LASTPROCESSEDVERSION on both source and target always seems
> correct (I don't see the -1 that Sean mentioned).
> - I'm creating new collections every time and running full data imports
> that take 5-10 minutes. Again, all data replication, log rollover, and
> autocommit activity seems to work as expected, and logs on target are
> deleted.  It's just those pesky source tlogs I can't get to delete.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/CDCR-how-to-deal-with-the-transaction-log-
> files-tp4345062p4345715.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


RE: CDCR - how to deal with the transaction log files

2017-07-20 Thread Patrick Hoeffel
I'm working on my first setup of CDCR, and I'm seeing the same "The log reader 
for target collection {collection name} is not initialised" as you saw.

It looks like you're creating collections on a regular basis, but for me, I 
create it one time and never again. I've been creating the collection first 
from defaults and then applying the CDCR-aware solrconfig changes afterward. It 
sounds like maybe I need to create the configset in ZK first, then create the 
collections, first on the Target and then on the Source, and I should be good?

Thanks,

Patrick Hoeffel

Senior Software Engineer
(Direct)  719-452-7371
(Mobile) 719-210-3706
patrick.hoef...@polarisalpha.com
PolarisAlpha.com 


-Original Message-
From: jmyatt [mailto:jmy...@wayfair.com] 
Sent: Wednesday, July 12, 2017 4:49 PM
To: solr-user@lucene.apache.org
Subject: Re: CDCR - how to deal with the transaction log files

glad to hear you found your solution!  I have been combing over this post and 
others on this discussion board many times and have tried so many tweaks to 
configuration, order of steps, etc, all with absolutely no success in getting 
the Source cluster tlogs to delete.  So incredibly frustrating.  If anyone has 
other pearls of wisdom I'd love some advice.  Quick hits on what I've tried:

- solrconfig exactly like Sean's (target and source respectively) except no 
autoSoftCommit
- I am also calling cdcr?action=DISABLEBUFFER (on source as well as on
target) explicitly before starting since the config setting of 
defaultState=disabled doesn't seem to work
- when I create the collection on source first, I get the warning "The log 
reader for target collection {collection name} is not initialised".  When I 
reverse the order (create the collection on target first), no such warning
- tlogs replicate as expected, hard commits on both target and source cause 
tlogs to rollover, etc - all of that works as expected
- action=QUEUES on source reflects the queueSize accurately.  Also *always* 
shows updateLogSynchronizer state as "stopped"
- action=LASTPROCESSEDVERSION on both source and target always seems correct (I 
don't see the -1 that Sean mentioned).
- I'm creating new collections every time and running full data imports that 
take 5-10 minutes. Again, all data replication, log rollover, and autocommit 
activity seems to work as expected, and logs on target are deleted.  It's just 
those pesky source tlogs I can't get to delete.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/CDCR-how-to-deal-with-the-transaction-log-files-tp4345062p4345715.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: CDCR - how to deal with the transaction log files

2017-07-17 Thread Susheel Kumar
I just voted for https://issues.apache.org/jira/browse/SOLR-11069 to get it
resolved, as we are planning to start using CDCR soon.

On Fri, Jul 14, 2017 at 5:21 PM, Varun Thacker  wrote:

> https://issues.apache.org/jira/browse/SOLR-11069 is tracking why is
> LASTPROCESSEDVERSION=-1
> on the source cluster always
>
> On Fri, Jul 14, 2017 at 11:46 AM, jmyatt  wrote:
>
> > Thanks for the suggestion - tried that today and still no luck.  Time to
> > write a script to naively / blindly delete old logs and run that in cron.
> > *sigh*
> >
> >
> >
> > --
> > View this message in context: http://lucene.472066.n3.
> > nabble.com/CDCR-how-to-deal-with-the-transaction-log-
> > files-tp4345062p4346138.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>


Re: CDCR - how to deal with the transaction log files

2017-07-14 Thread Varun Thacker
https://issues.apache.org/jira/browse/SOLR-11069 is tracking why
LASTPROCESSEDVERSION is always -1 on the source cluster.

On Fri, Jul 14, 2017 at 11:46 AM, jmyatt  wrote:

> Thanks for the suggestion - tried that today and still no luck.  Time to
> write a script to naively / blindly delete old logs and run that in cron.
> *sigh*
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/CDCR-how-to-deal-with-the-transaction-log-
> files-tp4345062p4346138.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: CDCR - how to deal with the transaction log files

2017-07-14 Thread jmyatt
Thanks for the suggestion - tried that today and still no luck.  Time to
write a script to naively / blindly delete old logs and run that in cron.
*sigh*
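For what it's worth, a naive cron cleanup like the one described might look like the sketch below. The directory layout, age threshold, and keep-count are assumptions, and as "naively / blindly" implies, deleting tlogs this way can discard updates CDCR has not yet replicated.

```python
# Hedged sketch of a blind tlog cleanup for cron: delete tlog files older
# than max_age_days, always sparing the newest `keep` files so the active
# log survives. Paths and thresholds are examples, not from the thread.
# WARNING: this can drop updates CDCR has not yet shipped to the target.
import os, time

def purge_old_tlogs(tlog_dir, max_age_days=7, keep=2, now=None):
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    # Oldest first, so slicing off the tail spares the newest files.
    files = sorted(
        (os.path.join(tlog_dir, f) for f in os.listdir(tlog_dir)),
        key=os.path.getmtime,
    )
    removed = []
    for path in (files[:-keep] if keep else files):
        if os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(path)
    return removed
```

A crontab entry would then just invoke this script nightly against each core's tlog directory.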



--
View this message in context: 
http://lucene.472066.n3.nabble.com/CDCR-how-to-deal-with-the-transaction-log-files-tp4345062p4346138.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: CDCR - how to deal with the transaction log files

2017-07-12 Thread Xie, Sean
Try running a second data import (or any other indexing job) after replication of 
the first data import has completed.

My observation is that during the replication period (when there are docs in the 
queue), tlog cleanup is not triggered. So wait until the queue is 0, then submit 
the second batch and monitor the queue and tlogs again.

-- Thank you
Sean

From: jmyatt <jmy...@wayfair.com>
Date: Wednesday, Jul 12, 2017, 6:58 PM
To: solr-user@lucene.apache.org
Subject: [EXTERNAL] Re: CDCR - how to deal with the transaction log files

glad to hear you found your solution!  I have been combing over this post and
others on this discussion board many times and have tried so many tweaks to
configuration, order of steps, etc, all with absolutely no success in
getting the Source cluster tlogs to delete.  So incredibly frustrating.  If
anyone has other pearls of wisdom I'd love some advice.  Quick hits on what
I've tried:

- solrconfig exactly like Sean's (target and source respectively) except no
autoSoftCommit
- I am also calling cdcr?action=DISABLEBUFFER (on source as well as on
target) explicitly before starting since the config setting of
defaultState=disabled doesn't seem to work
- when I create the collection on source first, I get the warning "The log
reader for target collection {collection name} is not initialised".  When I
reverse the order (create the collection on target first), no such warning
- tlogs replicate as expected, hard commits on both target and source cause
tlogs to rollover, etc - all of that works as expected
- action=QUEUES on source reflects the queueSize accurately.  Also *always*
shows updateLogSynchronizer state as "stopped"
- action=LASTPROCESSEDVERSION on both source and target always seems correct
(I don't see the -1 that Sean mentioned).
- I'm creating new collections every time and running full data imports that
take 5-10 minutes. Again, all data replication, log rollover, and autocommit
activity seems to work as expected, and logs on target are deleted.  It's
just those pesky source tlogs I can't get to delete.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/CDCR-how-to-deal-with-the-transaction-log-files-tp4345062p4345715.html
Sent from the Solr - User mailing list archive at Nabble.com.

Confidentiality Notice::  This email, including attachments, may include 
non-public, proprietary, confidential or legally privileged information.  If 
you are not an intended recipient or an authorized agent of an intended 
recipient, you are hereby notified that any dissemination, distribution or 
copying of the information contained in or transmitted with this e-mail is 
unauthorized and strictly prohibited.  If you have received this email in 
error, please notify the sender by replying to this message and permanently 
delete this e-mail, its attachments, and any copies of it immediately.  You 
should not retain, copy or use this e-mail or any attachment for any purpose, 
nor disclose all or any part of the contents to any other person. Thank you.


Re: CDCR - how to deal with the transaction log files

2017-07-12 Thread jmyatt
glad to hear you found your solution!  I have been combing over this post and
others on this discussion board many times and have tried so many tweaks to
configuration, order of steps, etc, all with absolutely no success in
getting the Source cluster tlogs to delete.  So incredibly frustrating.  If
anyone has other pearls of wisdom I'd love some advice.  Quick hits on what
I've tried:

- solrconfig exactly like Sean's (target and source respectively) except no
autoSoftCommit
- I am also calling cdcr?action=DISABLEBUFFER (on source as well as on
target) explicitly before starting since the config setting of
defaultState=disabled doesn't seem to work
- when I create the collection on source first, I get the warning "The log
reader for target collection {collection name} is not initialised".  When I
reverse the order (create the collection on target first), no such warning
- tlogs replicate as expected, hard commits on both target and source cause
tlogs to rollover, etc - all of that works as expected
- action=QUEUES on source reflects the queueSize accurately.  Also *always*
shows updateLogSynchronizer state as "stopped"
- action=LASTPROCESSEDVERSION on both source and target always seems correct
(I don't see the -1 that Sean mentioned).
- I'm creating new collections every time and running full data imports that
take 5-10 minutes. Again, all data replication, log rollover, and autocommit
activity seems to work as expected, and logs on target are deleted.  It's
just those pesky source tlogs I can't get to delete.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/CDCR-how-to-deal-with-the-transaction-log-files-tp4345062p4345715.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: CDCR - how to deal with the transaction log files

2017-07-10 Thread Xie, Sean
My guess is that it's a documentation gap.

I did a test: I turned off CDCR using action=STOP while continuously sending 
documents to the source cluster. The tlog files kept growing; after each hard 
commit a new tlog file was created, and the old files stayed there forever. As 
soon as I turned CDCR back on, the documents started to replicate to the target.

After a hard commit and a scheduled log synchronizer run, the old tlog files got 
deleted.

Btw, I'm running 6.5.1.
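The process state toggled here with action=STOP/START can be inspected with action=STATUS. A sketch of parsing such a response follows; the canned XML is hand-written in the CDCR API's status format, not output captured from a real cluster.

```python
# Parse a cdcr?action=STATUS response into a dict so a script can check
# whether the process is started/stopped and the buffer enabled/disabled.
# The sample below is an example in the documented response format.
import xml.etree.ElementTree as ET

def cdcr_status(status_xml):
    root = ET.fromstring(status_xml)
    status = root.find(".//lst[@name='status']")
    return {s.get("name"): s.text for s in status.findall("str")}

SAMPLE = """<response>
  <lst name="status">
    <str name="process">stopped</str>
    <str name="buffer">enabled</str>
  </lst>
</response>"""

print(cdcr_status(SAMPLE))  # -> {'process': 'stopped', 'buffer': 'enabled'}
```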



On 7/10/17, 10:57 PM, "Varun Thacker" <va...@vthacker.in> wrote:

Yeah it just seems weird that you would need to disable the buffer on the
source cluster though.

The docs say "Replicas do not need to buffer updates, and it is recommended
to disable buffer on the target SolrCloud" which means the source should
have it enabled.

But the fact that it's working for you proves otherwise. What version of
Solr are you running? I'll try reproducing this problem at my end and see
if it's a documentation gap or a bug.

On Mon, Jul 10, 2017 at 7:15 PM, Xie, Sean <sean@finra.org> wrote:

> Yes. Documents are being sent to target. Monitoring the output from
> “action=queues”, depending your settings, you will see the documents
> replication progress.
>
> On the other hand, if enable the buffer, the lastprocessedversion is
> always returning -1. Reading the source code, the CdcrUpdateLogSynchroizer
> does not continue to do the clean if this value is -1.
>
> Sean
>
> On 7/10/17, 5:18 PM, "Varun Thacker" <va...@vthacker.in> wrote:
>
> After disabling the buffer are you still seeing documents being
> replicated
> to the target cluster(s) ?
>
> On Mon, Jul 10, 2017 at 1:07 PM, Xie, Sean <sean@finra.org> wrote:
>
> > After several experiments and observation, finally make it work.
> > The key point is you have to also disablebuffer on source cluster. I
> don’t
> > know why in the wiki, it didn’t mention it, but I figured this out
> through
> > the source code.
> > Once disablebuffer on source cluster, the lastProcessedVersion will
> become
> > a position number, and when there is hard commit, the old unused
> tlog files
> > get deleted.
> >
> > Hope my finding can help other users who experience the same issue.
> >
> >
> > On 7/10/17, 9:08 AM, "Michael McCarthy" <michael.mccar...@gm.com>
> wrote:
> >
> > We have been experiencing this same issue for months now, with
> version
> > 6.2.  No solution to date.
> >
    >     >     -Original Message-
    > > From: Xie, Sean [mailto:sean@finra.org]
> > Sent: Sunday, July 09, 2017 9:41 PM
> > To: solr-user@lucene.apache.org
> > Subject: [EXTERNAL] Re: CDCR - how to deal with the transaction
> log
> > files
> >
> > Did another round of testing, the tlog on target cluster is
> cleaned up
> > once the hard commit is triggered. However, on source cluster, the
> tlog
> > files stay there and never gets cleaned up.
> >
> > Not sure if there is any command to run manually to trigger the
> > updateLogSynchronizer. The updateLogSynchronizer already set at run
> at
> > every 10 seconds, but seems it didn’t help.
> >
> > Any help?
> >
> > Thanks
> > Sean
> >
> > On 7/8/17, 1:14 PM, "Xie, Sean" <sean@finra.org> wrote:
> >
> > I have monitored the CDCR process for a while, the updates
> are
> > actively sent to the target without a problem. However the tlog size
> and
> > files count are growing everyday, even when there is 0 updates to
> sent, the
> > tlog stays there:
> >
> > Following is from the action=queues command, and you can see
> after
> > about a month or so running days, the total transaction are reaching
> to
> > 140K total files, and size is about 103G.
> >
> > 
> > 
> > 0
> > 465
> > 
> > 
> > 
> > 
> > 0

Re: CDCR - how to deal with the transaction log files

2017-07-10 Thread Varun Thacker
Yeah, it just seems weird that you would need to disable the buffer on the
source cluster though.

The docs say "Replicas do not need to buffer updates, and it is recommended
to disable buffer on the target SolrCloud", which implies the source should
have it enabled.

But the fact that it's working for you proves otherwise. What version of
Solr are you running? I'll try reproducing this problem at my end and see
if it's a documentation gap or a bug.

On Mon, Jul 10, 2017 at 7:15 PM, Xie, Sean <sean@finra.org> wrote:

> Yes. Documents are being sent to target. Monitoring the output from
> “action=queues”, depending your settings, you will see the documents
> replication progress.
>
> On the other hand, if enable the buffer, the lastprocessedversion is
> always returning -1. Reading the source code, the CdcrUpdateLogSynchroizer
> does not continue to do the clean if this value is -1.
>
> Sean
>
> On 7/10/17, 5:18 PM, "Varun Thacker" <va...@vthacker.in> wrote:
>
> After disabling the buffer are you still seeing documents being
> replicated
> to the target cluster(s) ?
>
> On Mon, Jul 10, 2017 at 1:07 PM, Xie, Sean <sean@finra.org> wrote:
>
> > After several experiments and observation, finally make it work.
> > The key point is you have to also disablebuffer on source cluster. I
> don’t
> > know why in the wiki, it didn’t mention it, but I figured this out
> through
> > the source code.
> > Once disablebuffer on source cluster, the lastProcessedVersion will
> become
> > a position number, and when there is hard commit, the old unused
> tlog files
> > get deleted.
> >
> > Hope my finding can help other users who experience the same issue.
> >
> >
> > On 7/10/17, 9:08 AM, "Michael McCarthy" <michael.mccar...@gm.com>
> wrote:
> >
> > We have been experiencing this same issue for months now, with
> version
> > 6.2.  No solution to date.
> >
>     >     -Original Message-----
> >     From: Xie, Sean [mailto:sean@finra.org]
> > Sent: Sunday, July 09, 2017 9:41 PM
> > To: solr-user@lucene.apache.org
> > Subject: [EXTERNAL] Re: CDCR - how to deal with the transaction
> log
> > files
> >
> > Did another round of testing, the tlog on target cluster is
> cleaned up
> > once the hard commit is triggered. However, on source cluster, the
> tlog
> > files stay there and never gets cleaned up.
> >
> > Not sure if there is any command to run manually to trigger the
> > updateLogSynchronizer. The updateLogSynchronizer already set at run
> at
> > every 10 seconds, but seems it didn’t help.
> >
> > Any help?
> >
> > Thanks
> > Sean
> >
> > On 7/8/17, 1:14 PM, "Xie, Sean" <sean@finra.org> wrote:
> >
> > I have monitored the CDCR process for a while, the updates
> are
> > actively sent to the target without a problem. However the tlog size
> and
> > files count are growing everyday, even when there is 0 updates to
> sent, the
> > tlog stays there:
> >
> > Following is from the action=queues command, and you can see
> after
> > about a month or so running days, the total transaction are reaching
> to
> > 140K total files, and size is about 103G.
> >
> > 
> > 
> > 0
> > 465
> > 
> > 
> > 
> > 
> > 0
> > 2017-07-07T23:19:09.655Z
> > 
> > 
> > 
> > 102740042616
> > 140809
> > stopped
> > 
> >
> > Any help on it? Or do I need to configure something else?
> The CDCR
> > configuration is pretty much following the wiki:
> >
> > On target:
> >
> >   
> > 
> >   disabled
> > 
> >   
> >
> >   
> > 
> > 
> >   
> >
> >   
> > 
> >   cdcr-processor-chain
> > 
> >   
> >
> >   
> > 
>   

Re: CDCR - how to deal with the transaction log files

2017-07-10 Thread Xie, Sean
Yes. Documents are being sent to the target. Monitoring the output from 
“action=queues” (depending on your settings), you will see the document 
replication progress.

On the other hand, if the buffer is enabled, lastprocessedversion always 
returns -1. Reading the source code, the CdcrUpdateLogSynchronizer does not 
continue with the cleanup if this value is -1.

Sean
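A simplified illustration of that guard, not Solr's actual CdcrUpdateLogSynchronizer code: cleanup is skipped outright when the last processed version is -1, and otherwise tlogs wholly at or below the watermark become removable.

```python
# Toy model of the behavior described above: -1 (buffer enabled) means no
# cleanup ever happens; a positive watermark lets fully-replicated tlogs go.
# Filenames and version numbers are invented for illustration.
def cleanup_tlogs(tlog_versions, last_processed_version):
    """tlog_versions: {filename: highest update version in that tlog}.
    Returns the filenames that would be deleted."""
    if last_processed_version == -1:
        return []  # buffer enabled -> synchronizer skips cleanup entirely
    return [f for f, v in sorted(tlog_versions.items())
            if v <= last_processed_version]

logs = {"tlog.001": 100, "tlog.002": 200, "tlog.003": 300}
print(cleanup_tlogs(logs, -1))   # -> []
print(cleanup_tlogs(logs, 200))  # -> ['tlog.001', 'tlog.002']
```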

On 7/10/17, 5:18 PM, "Varun Thacker" <va...@vthacker.in> wrote:

After disabling the buffer are you still seeing documents being replicated
to the target cluster(s) ?

On Mon, Jul 10, 2017 at 1:07 PM, Xie, Sean <sean@finra.org> wrote:

> After several experiments and observation, finally make it work.
> The key point is you have to also disablebuffer on source cluster. I don’t
> know why in the wiki, it didn’t mention it, but I figured this out through
> the source code.
> Once the buffer is disabled on the source cluster, the lastProcessedVersion
> becomes a positive number, and when there is a hard commit, the old unused
> tlog files get deleted.
>
> Hope my finding can help other users who experience the same issue.
>
>
> On 7/10/17, 9:08 AM, "Michael McCarthy" <michael.mccar...@gm.com> wrote:
>
> We have been experiencing this same issue for months now, with version
> 6.2.  No solution to date.
>
> -Original Message-
> From: Xie, Sean [mailto:sean@finra.org]
> Sent: Sunday, July 09, 2017 9:41 PM
> To: solr-user@lucene.apache.org
> Subject: [EXTERNAL] Re: CDCR - how to deal with the transaction log
> files
>
> Did another round of testing, the tlog on target cluster is cleaned up
> once the hard commit is triggered. However, on source cluster, the tlog
> files stay there and never gets cleaned up.
>
> Not sure if there is any command to run manually to trigger the
> updateLogSynchronizer. The updateLogSynchronizer already set at run at
> every 10 seconds, but seems it didn’t help.
>
> Any help?
>
> Thanks
> Sean
>
> On 7/8/17, 1:14 PM, "Xie, Sean" <sean@finra.org> wrote:
>
> I have monitored the CDCR process for a while, the updates are
> actively sent to the target without a problem. However the tlog size and
> files count are growing everyday, even when there is 0 updates to sent, 
the
> tlog stays there:
>
> Following is from the action=queues command, and you can see after
> about a month or so running days, the total transaction are reaching to
> 140K total files, and size is about 103G.
>
> 
> 
> 0
> 465
> 
> 
> 
> 
> 0
> 2017-07-07T23:19:09.655Z
> 
> 
> 
> 102740042616
> 140809
> stopped
> 
>
> Any help on it? Or do I need to configure something else? The CDCR
> configuration is pretty much following the wiki:
>
> On target:
>
>   
> 
>   disabled
> 
>   
>
>   
> 
> 
>   
>
>   
> 
>   cdcr-processor-chain
> 
>   
>
>   
> 
>   ${solr.ulog.dir:}
> 
> 
>   ${solr.autoCommit.maxTime:18}
>   false
> 
>
> 
>   ${solr.autoSoftCommit.maxTime:3}
> 
>   
>
> On source:
>   
> 
>   ${TargetZk}
>   MY_COLLECTION
>   MY_COLLECTION
> 
>
> 
>   1
>   1000
>   128
> 
>
> 
>   6
> 
>   
>
>   
> 
>   ${solr.ulog.dir:}
> 
> 
>   ${solr.autoCommit.maxTime:18}
>   false
> 
>
> 
>   ${solr.autoSoftCommit.maxTime:3}
> 
>   
>
  

Re: CDCR - how to deal with the transaction log files

2017-07-10 Thread Varun Thacker
After disabling the buffer, are you still seeing documents being replicated
to the target cluster(s)?


RE: CDCR - how to deal with the transaction log files

2017-07-10 Thread Xie, Sean
After several experiments and observations, I finally made it work.
The key point is that you also have to disable the buffer (action=DISABLEBUFFER) on 
the source cluster. I don't know why the wiki doesn't mention it, but I figured 
this out from the source code.
Once the buffer is disabled on the source cluster, lastProcessedVersion becomes a 
positive number, and when there is a hard commit, the old unused tlog files get 
deleted.
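In API terms, the fix described above is one call per cluster. A sketch follows; the host and collection names are placeholders, not values from the original thread:

```python
def cdcr_url(base, collection, action):
    """Build a CDCR API call of the kind used throughout this thread."""
    return f"{base}/solr/{collection}/cdcr?action={action}"

# The buffer must be disabled on BOTH clusters; the source side is the step
# the wiki does not spell out.
for host in ("http://source-host:8983", "http://target-host:8983"):
    print(cdcr_url(host, "MY_COLLECTION", "DISABLEBUFFER"))
```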

Hope my finding can help other users who experience the same issue.



RE: CDCR - how to deal with the transaction log files

2017-07-10 Thread Xie, Sean
I did some source-code reading, and it looks like when lastProcessedVersion == -1, 
the synchronizer does nothing:

https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/CdcrUpdateLogSynchronizer.java

// if we received -1, it means that the log reader on the leader has 
not yet started to read log entries
// do nothing
if (lastVersion == -1) {
  return;
}

So I queried Solr to find out, and here is the result:

/cdcr?action=LASTPROCESSEDVERSION

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <long name="lastProcessedVersion">-1</long>
</response>

Anything could cause this issue to happen?
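That -1 guard can be mirrored on the monitoring side. A sketch that parses such a response and flags the stuck state (assuming the field is named lastProcessedVersion, as in the prose above):

```python
import xml.etree.ElementTree as ET

# Sample response of the assumed shape, with the stuck value reported above.
SAMPLE = """<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst>
  <long name="lastProcessedVersion">-1</long>
</response>"""

def cleanup_stuck(xml_text):
    """True when the leader reports -1, i.e. old tlogs will never be purged."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("./long[@name='lastProcessedVersion']")) == -1

print(cleanup_stuck(SAMPLE))  # -> True
```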


Sean



RE: CDCR - how to deal with the transaction log files

2017-07-10 Thread Michael McCarthy
We have been experiencing this same issue for months now, with version 6.2.  No 
solution to date.







Nothing in this message is intended to constitute an electronic signature 
unless a specific statement to the contrary is included in this message.

Confidentiality Note: This message is intended only for the person or entity to 
which it is addressed. It may contain confidential and/or privileged material. 
Any review, transmission, dissemination or other use, or taking of any action 
in reliance upon this message by persons or entities other than the intended 
recipient is prohibited and may be unlawful. If you received this message in 
error, please contact the sender and delete it from your computer.


Re: CDCR - how to deal with the transaction log files

2017-07-09 Thread Xie, Sean
Did another round of testing: the tlog on the target cluster is cleaned up once a 
hard commit is triggered. However, on the source cluster, the tlog files stay there 
and never get cleaned up.

Not sure if there is any command to run manually to trigger the 
updateLogSynchronizer. It is already set to run every 10 seconds, but that does 
not seem to help.
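For reference, that 10-second schedule corresponds to a millisecond value inside the CDCR request handler config on the source. A sketch of the relevant fragment (the enclosing requestHandler element is omitted, and the value here is illustrative):

```xml
<lst name="updateLogSynchronizer">
  <!-- milliseconds: 10000 = check every 10 seconds whether old tlogs can go -->
  <str name="schedule">10000</str>
</lst>
```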

Any help?

Thanks
Sean



Re: CDCR - how to deal with the transaction log files

2017-07-08 Thread Xie, Sean
I have monitored the CDCR process for a while; updates are actively sent to the 
target without a problem. However, the tlog size and file count are growing every 
day, and even when there are 0 updates to send, the tlogs stay there:

The following is from the action=queues command. You can see that after about a 
month of running, the total number of transaction log files has reached about 
140K, and their size is about 103 GB:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">465</int>
  </lst>
  <lst name="queues">
    <lst name="${TargetZk}">
      <lst name="MY_COLLECTION">
        <long name="queueSize">0</long>
        <str name="lastTimestamp">2017-07-07T23:19:09.655Z</str>
      </lst>
    </lst>
  </lst>
  <long name="tlogTotalSize">102740042616</long>
  <long name="tlogTotalCount">140809</long>
  <str name="updateLogSynchronizer">stopped</str>
</response>
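Sanity-checking the quoted totals with plain arithmetic:

```python
# Totals reported by action=queues above.
total_bytes = 102_740_042_616
file_count = 140_809

print(round(total_bytes / 1e9, 1))      # -> 102.7, i.e. the "about 103G" quoted
print(round(total_bytes / file_count))  # -> 729641 bytes per tlog file on average
```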

Any help on it? Or do I need to configure something else? The CDCR 
configuration is pretty much following the wiki:

On target:

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="buffer">
      <str name="defaultState">disabled</str>
    </lst>
  </requestHandler>

  <updateRequestProcessorChain name="cdcr-processor-chain">
    <processor class="solr.CdcrUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

  <requestHandler name="/update" class="solr.UpdateRequestHandler">
    <lst name="defaults">
      <str name="update.chain">cdcr-processor-chain</str>
    </lst>
  </requestHandler>

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog class="solr.CdcrUpdateLog">
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>
    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:18}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:3}</maxTime>
    </autoSoftCommit>
  </updateHandler>

On source:

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="replica">
      <str name="zkHost">${TargetZk}</str>
      <str name="source">MY_COLLECTION</str>
      <str name="target">MY_COLLECTION</str>
    </lst>

    <lst name="replicator">
      <str name="threadPoolSize">1</str>
      <str name="schedule">1000</str>
      <str name="batchSize">128</str>
    </lst>

    <lst name="updateLogSynchronizer">
      <str name="schedule">6</str>
    </lst>
  </requestHandler>

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog class="solr.CdcrUpdateLog">
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>
    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:18}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:3}</maxTime>
    </autoSoftCommit>
  </updateHandler>

Thanks.
Sean



Re: CDCR - how to deal with the transaction log files

2017-07-08 Thread Erick Erickson
This should not be the case if you are actively sending updates to the
target cluster. The tlog is used to store unsent updates, so if the
connection is broken for some time, the target cluster will have a
chance to catch up.

If you don't have the remote DC online and do not intend to bring it
online soon, you should turn CDCR off.

Best,
Erick

On Fri, Jul 7, 2017 at 9:35 PM, Xie, Sean  wrote:
> Once CDCR is enabled, the update log stores an unlimited number of entries. This
> causes the tlog folder to grow bigger and bigger, and the number of open files to
> grow as well. How can one reduce the number of open files and also the tlog
> files? If it's not taken care of properly, sooner or later the log file size and
> open file count will exceed the limits.
>
> Thanks
> Sean
>
>
> Confidentiality Notice::  This email, including attachments, may include 
> non-public, proprietary, confidential or legally privileged information.  If 
> you are not an intended recipient or an authorized agent of an intended 
> recipient, you are hereby notified that any dissemination, distribution or 
> copying of the information contained in or transmitted with this e-mail is 
> unauthorized and strictly prohibited.  If you have received this email in 
> error, please notify the sender by replying to this message and permanently 
> delete this e-mail, its attachments, and any copies of it immediately.  You 
> should not retain, copy or use this e-mail or any attachment for any purpose, 
> nor disclose all or any part of the contents to any other person. Thank you.