Re: [cisco-voip] Constantly having db replication issues

2016-04-20 Thread Brian Meade
You can also do "file tail activelog cm/trace/dbl/sdi recent" which will
tail the most recent log file in the directory.  Saves you a few seconds.

On Wed, Apr 20, 2016 at 6:00 PM, Daniel Ohnesorge via cisco-voip <
cisco-voip@puck.nether.net> wrote:

> Hi Nick,
>
> At the time of failure, I like to tail the dbl logs to see what's
> happening. First, run this command to see which log is the latest: file list
> activelog cm/trace/dbl/sdi date detail. That output lists the most recently
> written log file at the bottom, for example log16.log. Then you can tail
> that file using the command: file tail activelog cm/trace/dbl/sdi/log16.log.
> This will give you some idea of what is happening at that time.
___
cisco-voip mailing list
cisco-voip@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip


Re: [cisco-voip] Constantly having db replication issues

2016-04-20 Thread Daniel Ohnesorge via cisco-voip
 

Hi Nick, 

At the time of failure, I like to tail the dbl logs to see what's
happening. First, run this command to see which log is the latest: file
list activelog cm/trace/dbl/sdi date detail. That output lists the
most recently written log file at the bottom, for example log16.log. Then
you can tail that file using the command: file tail activelog
cm/trace/dbl/sdi/log16.log. This will give you some idea of what is
happening at that time.
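
The two steps described here (find the most recently written log, then
tail it) can be illustrated with a rough local analogue in Python. This is
only a sketch of the idea on an ordinary filesystem, for example against
trace files already downloaded from the node. It does not run on the CUCM
CLI, and the function names are made up for illustration.

```python
from collections import deque
from pathlib import Path


def newest_file(directory):
    """Return the most recently modified regular file in directory, or None."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def tail(path, lines=10):
    """Return the last `lines` lines of a text file, without newlines."""
    with open(path, "r", errors="replace") as handle:
        # deque with maxlen keeps only the final `lines` entries.
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]
```

Calling tail(newest_file("some/local/copy/of/sdi"), 20) would then show the
last 20 lines of whichever log was written most recently; the directory
name is a placeholder.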

On 2016-04-21 07:20, Nick Barnett wrote: 

> Thanks James 
> Ok, yes, there's a lot in rhosts. They are all identical, and each of them 
> has forward and reverse lookups. 


Re: [cisco-voip] Constantly having db replication issues

2016-04-20 Thread Nick Barnett
Thanks James

Ok, yes, there's a lot in rhosts. They are all identical, and each of them
has forward and reverse lookups.



On Wed, Apr 20, 2016 at 12:39 PM, James Buchanan 
wrote:

> Hello,
>
> Even though you are not using DNS, do you have DNS servers and a domain
> name configured? If so, you should have forward and reverse entries
> configured for all servers. When you look in Unified Reporting, do you see
> anything about the rhosts under Database Status?
>
> Thanks,
>
> James


Re: [cisco-voip] Constantly having db replication issues

2016-04-20 Thread Ryan Huff
Nick,

Each time you have one of these DB replication issues, have you always 
been able to tie it to a WAN event? The reason I ask is that you may be 
having these issues regardless of the WAN, even though once or twice they 
have lined up with a WAN event.

Do me a flavor; send me the output of:

- utils diagnose test
- utils ntp server list
- utils dbreplication runtimestate
- show network cluster
- run sql select name,description,nodeid from processnode

That is a lot of output, so you may want to throw it in a spreadsheet or 
something instead of pasting it inline in this email. All of these commands 
should be run from the CLI of the CUCM publisher.

Thanks,

Ryan
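
Whether every replication alert really follows a WAN event can be checked
mechanically once both sets of timestamps are written down. The following
is a minimal Python sketch, not something from the thread: the timestamps
are hypothetical, and the two-hour window is an assumption loosely based on
the "within an hour or so" delay described in the original post.

```python
from datetime import datetime, timedelta


def unexplained_alerts(alert_times, wan_event_times, window=timedelta(hours=2)):
    """Return the alerts that did not occur within `window` after any WAN event."""
    unexplained = []
    for alert in alert_times:
        follows_wan_event = any(
            timedelta(0) <= alert - event <= window for event in wan_event_times
        )
        if not follows_wan_event:
            unexplained.append(alert)
    return unexplained


# Hypothetical timestamps, for illustration only.
wan_events = [datetime(2016, 4, 1, 2, 0)]
alerts = [datetime(2016, 4, 1, 2, 45), datetime(2016, 4, 9, 14, 30)]
print(unexplained_alerts(alerts, wan_events))
```

Any alert this returns happened without a preceding WAN event in the
chosen window, which would suggest replication is failing for some other
reason as well.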


On Apr 20, 2016, at 1:08 PM, Nick Barnett wrote:

> Thanks Ryan.
>
> We have 3 CCM and 1 TFTP node in each of our two data centers. The main data
> center is here, and that is where our DRS sftp server (and publisher) is
> located. Nothing is using DNS right now, all of the servers are entered into
> CUCM as IP addresses... this cluster has been around for years. It was upgraded
> from 7.BeforeMyTime to 8.6 to 10.0.



Re: [cisco-voip] Constantly having db replication issues

2016-04-20 Thread James Buchanan
Hello,

Even though you are not using DNS, do you have DNS servers and a domain
name configured? If so, you should have forward and reverse entries
configured for all servers. When you look in Unified Reporting, do you see
anything about the rhosts under Database Status?

Thanks,

James
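
A forward and reverse lookup check of the kind suggested here can be
sketched in Python. This is an illustrative sketch only: it uses the
resolver of the machine the script runs on, not the DNS servers configured
on the CUCM nodes, and the function name and return format are assumptions
made for the example.

```python
import socket


def check_forward_reverse(hostname, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Check that hostname resolves to an IP and that the IP maps back to it.

    Returns (ok, detail). The lookup functions are parameters so they can
    be replaced with fakes when testing.
    """
    try:
        ip = resolve(hostname)
    except OSError as exc:
        return False, f"forward lookup failed: {exc}"
    try:
        reverse_name, aliases, _ = reverse(ip)
    except OSError as exc:
        return False, f"reverse lookup for {ip} failed: {exc}"
    # Compare case-insensitively and ignore a trailing dot.
    names = {n.lower().rstrip(".") for n in [reverse_name, *aliases]}
    if hostname.lower().rstrip(".") in names:
        return True, f"{hostname} <-> {ip}"
    return False, f"{ip} maps back to {reverse_name}, not {hostname}"
```

Running it over the fully qualified name of each node in the cluster would
flag any server whose A and PTR records disagree.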

On Wed, Apr 20, 2016 at 1:07 PM, Nick Barnett 
wrote:

> Thanks Ryan.
>
> We have 3 CCM and 1 TFTP node in each of our two data centers. The main
> data center is here, and that is where our DRS sftp server (and publisher)
> is located. Nothing is using DNS right now, all of the servers are entered
> into CUCM as IP addresses... this cluster has been around for years. It was
> upgraded from 7.BeforeMyTime to 8.6 to 10.0.


Re: [cisco-voip] Constantly having db replication issues

2016-04-20 Thread Nick Barnett
Thanks Ryan.

We have 3 CCM and 1 TFTP node in each of our two data centers. The main
data center is here, and that is where our DRS sftp server (and publisher)
is located. Nothing is using DNS right now, all of the servers are entered
into CUCM as IP addresses... this cluster has been around for years. It was
upgraded from 7.BeforeMyTime to 8.6 to 10.0.



On Wed, Apr 20, 2016 at 11:54 AM, Ryan Huff  wrote:

> Hi Nick.
>
> Let me ask you a few things;
>
> - How is the cluster laid out (how many nodes in the cluster and what
> nodes are in which DC)?
>
> - Are you using DNS and if so, where is the DNS server located and do you
> have redundant DNS in both DCs?
>
> - Where is your DRS server in relation to the cluster publisher (same DC
> or no)?
>
> Thanks,
>
> Ryan


Re: [cisco-voip] Constantly having db replication issues

2016-04-20 Thread Ryan Huff
Hi Nick.

Let me ask you a few things:

- How is the cluster laid out (how many nodes in the cluster and what nodes are 
in which DC)?

- Are you using DNS and if so, where is the DNS server located and do you have 
redundant DNS in both DCs?

- Where is your DRS server in relation to the cluster publisher (same DC or no)?

Thanks,

Ryan

On Apr 20, 2016, at 11:09 AM, Nick Barnett wrote:

> I'm wondering how many others have had as many issues with db replication? It
> seems that any time we lose a connection to our 2nd data center (even a 2
> minute MPLS planned maintenance outage causes the issue), our database
> synchronization has errors.  After a WAN blip, within an hour or so, I get a
> message from RTMT about a subscriber being in "blocked" state:
>
> %[AppID=Cisco Database Layer
> Monitor][ClusterID=ProdVoiceCluster][NodeID=XXX1]: A change notification
> client is busy (blocked). If the change notification client continues to be
> blocked for 10 minutes, the system automatically clears the block and change
> notification should resume successfully."
>
> After that, if I run utils dbreplication status, it will have errors... so then
> I run the "repair all" option and it fixes it. Then I'm good for a few weeks
> until something else happens that starts the whole cycle over.
>
> Something else that happens after a WAN blip is that DRS begins to fail, so we
> have to restart the master DRS and the subsequent DRS services on the subs. Am
> I doing something wrong? Is this normal?
>
> I'm on CUCM 10.0.1.12900-2.
>
> Thanks,
> Nick



[cisco-voip] Constantly having db replication issues

2016-04-20 Thread Nick Barnett
I'm wondering how many others have had as many issues with db replication?
It seems that any time we lose a connection to our 2nd data center (even a
2 minute MPLS planned maintenance outage causes the issue), our database
synchronization has errors.  After a WAN blip, within an hour or so, I get
a message from RTMT about a subscriber being in "blocked" state:

%[AppID=Cisco Database Layer
Monitor][ClusterID=ProdVoiceCluster][NodeID=XXX1]: A change
notification client is busy (blocked). If the change notification client
continues to be blocked for 10 minutes, the system automatically clears the
block and change notification should resume successfully."


After that, if I run utils dbreplication status, it will have errors... so
then I run the "repair all" option and it fixes it. Then I'm good for a few
weeks until something else happens that starts the whole cycle over.

Something else that happens after a WAN blip is that DRS begins to fail, so
we have to restart the master DRS and the subsequent DRS services on the
subs. Am I doing something wrong? Is this normal?

I'm on CUCM 10.0.1.12900-2.

Thanks,
Nick
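
The alert quoted above carries its application, cluster, and node in
bracketed key=value fields, which makes it easy to pull apart when
tracking which subscriber gets blocked over time. The following is a
minimal Python sketch that assumes only the layout visible in that one
alert; the function name is made up for illustration and other RTMT alerts
may be formatted differently.

```python
import re

FIELD_PATTERN = re.compile(r"\[(\w+)=([^\]]*)\]")


def parse_alert(text):
    """Split an alert into its bracketed fields and the message text."""
    flat = " ".join(text.split())  # undo the line wrapping added by email
    header, _, message = flat.partition("]: ")
    # partition() consumed the final "]", so put it back before matching.
    fields = dict(FIELD_PATTERN.findall(header + "]"))
    return fields, message


alert = """%[AppID=Cisco Database Layer
Monitor][ClusterID=ProdVoiceCluster][NodeID=XXX1]: A change
notification client is busy (blocked)."""
fields, message = parse_alert(alert)
print(fields["NodeID"], "-", message)
```

Counting alerts per NodeID over a few weeks would show whether the same
subscriber is blocked every time or whether it varies.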