Re: Cassandra takes more time to join the cluster

2017-12-27 Thread sat
Hi,

This issue is related to
https://issues.apache.org/jira/browse/CASSANDRA-9630

Thanks and Regards
A.SathishKumar


Re: Discrepancy in nodetool status

2017-12-27 Thread sat
Hi,

We suspect the issue we are facing is related to

https://issues.apache.org/jira/browse/CASSANDRA-9630

Will it be fixed in the 3.11 release?

Thanks and Regards
A.SathishKumar




-- 
A.SathishKumar
044-24735023


Re: Cassandra takes more time to join the cluster

2017-12-24 Thread sat
Hi,

We have a 3-node cluster. Before we rebooted one of the VMware VMs/nodes
(RHEL 7.2), all 3 nodes formed the cluster without any issues. However,
after the reboot, the rebooted node takes much longer to rejoin
(approximately 10 - 15 min) in about 7 out of 10 attempts.

Cassandra version: 3.9

While investigating the issue further, we noticed:

   - Node 1 (the rebooted node) is able to send "SYN" and "ACK2" messages
   to both other nodes (Node 2, Node 3), even though nodetool status on
   Node 1 (and only there) displays Node 2 and Node 3 as "DN".
   - After 10 - 15 min, a "Connection Timeout" exception is thrown on
   Node 2 and Node 3 from OutboundTcpConnection.java (line # 311), which
   triggers a state change event to "Node 1" and changes the state to "UN".



> if (logger.isTraceEnabled())
>     logger.trace("error writing to {}", poolReference.endPoint(), e);


Please let us know what triggers the "Connection Timeout" exception on
Node 2 and Node 3, and how to resolve it.


Thanks and Regards
A.SathishKumar


Re: Discrepancy in nodetool status

2017-12-22 Thread sat
Hi,

We rebooted Node 1 again, and this time nodetool status on Node 1
displayed "UN" for all 3 nodes.

Executing nodetool status on Node 3 displays "UN" for all the nodes.

Executing nodetool status on Node 2 displays "DN" for Node 1 (the rebooted
node) and "UN" for the other 2 nodes. We also observed Node 2 sending "Syn"
messages to Node 1, but it received no "Ack" from Node 1 for the first
15 min; after that it started receiving them.

Please let us know why Node 2 does not receive any "Ack" message from
Node 1 for 10 - 15 minutes and how it is suddenly able to receive "Ack"
messages.

The inter-node communication port is 7000.

It is a 3-node cluster, and all 3 nodes are configured as seed IPs.

Thanks and Regards
A.SathishKumar






-- 
A.SathishKumar
044-24735023


Re: Discrepancy in nodetool status

2017-12-22 Thread sat
Hi,

We checked and we were able to telnet to port 7000.
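
For reference, the telnet check can be scripted. The following is a minimal sketch, not part of the original exchange; the helper name port_is_open is hypothetical, and port 7000 is the inter-node port mentioned elsewhere in this thread. Note that a successful connect only shows that a new TCP connection can be opened at that moment; it says nothing about connections that were already established before the reboot.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a new TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling port_is_open("<node-ip>", 7000) from each node against the other two gives the same information as the manual telnet test.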

Thanks and Regards
A.SathishKumar

On Fri, Dec 22, 2017 at 3:43 PM, Nitan Kainth <nitankai...@gmail.com> wrote:

> Try telnet on your listen port. It must be network issue due to port or
> firewall issue.


-- 
A.SathishKumar
044-24735023


Discrepancy in nodetool status

2017-12-22 Thread sat
Hi,

We have 3 nodes in a cluster. We rebooted one of the Cassandra VMs and
noticed nodetool status on it returning "UN" for itself and "DN" for the
other nodes, although we observe gossip syn and ack messages being
exchanged between these nodes.

*Issue in Detail*

*Nodes in cluster*
Node 1
Node 2
Node 3

All of the above nodes formed a cluster, and nodetool status on all 3
machines showed "UN".

We rebooted Node 1 and restarted Cassandra on it, then ran nodetool status
there and observed:

Node 1 - UN
Node 2 - DN
Node 3 - DN

However, when we run nodetool status on the other 2 nodes (Node 2 and
Node 3), they report all 3 nodes as "UN".

We enabled "Trace" level logging and checked the gossip messages. On Node 1
we saw "SYN", "ACK" and "ACK2" messages both sent to and received from the
other 2 nodes, but nodetool status still marks the other 2 nodes as down.

Please let us know how nodetool detects other nodes as "DOWN". Any help is
highly appreciated.

Thanks
A.SathishKumar


Re: Cassandra takes more time to join the cluster

2017-12-22 Thread sat
Hi Jeff,

Thanks for your prompt response. Please find the logs below. Does gossip
depend on NTP? We ask because the node we reboot takes time to sync with
NTP.

When does the FailureDetector kick in? We have this issue only when we see
the Gossiper and FailureDetector log messages.
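
On the question of when the failure detector convicts a node, here is a minimal sketch. It assumes (this is not stated in the thread) that Cassandra 3.x computes phi as the time since the last heartbeat divided by the mean heartbeat interval, and marks a node down once phi / ln(10) exceeds phi_convict_threshold, taken here to be 8. The function names are hypothetical. The trace values in this thread look consistent with that reading: PHI 0.0988 times a mean of 9.98E8 ns is about 98.6 ms, which matches the gap between the 12:46:26,759 heartbeat update and the 12:46:26,857 status check.

```python
import math

PHI_FACTOR = 1.0 / math.log(10.0)  # assumption: scaling applied before comparing
PHI_CONVICT_THRESHOLD = 8.0        # assumption: default phi_convict_threshold


def phi(now_ns: float, last_heartbeat_ns: float, mean_interval_ns: float) -> float:
    """Elapsed time since the last heartbeat, in units of the mean interval."""
    return (now_ns - last_heartbeat_ns) / mean_interval_ns


def is_convicted(phi_value: float, threshold: float = PHI_CONVICT_THRESHOLD) -> bool:
    """True once the scaled phi crosses the conviction threshold."""
    return PHI_FACTOR * phi_value > threshold


def silence_needed_s(mean_interval_ns: float,
                     threshold: float = PHI_CONVICT_THRESHOLD) -> float:
    """Seconds without a heartbeat before a node would be convicted."""
    return threshold / PHI_FACTOR * mean_interval_ns / 1e9
```

Under these assumptions, a mean interval of about one second means a peer has to stay silent for roughly 18 seconds before it is convicted, so the PHI values in the trace (0.10 and 0.28) are far from the threshold.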

TRACE [GossipStage:1] 2017-12-22 12:46:26,759 FailureDetector.java:272 - Average for /10.63.114.158 is 9.977908204545455E8
TRACE [GossipStage:1] 2017-12-22 12:46:26,759 Gossiper.java:1140 - /10.63.114.158 local generation 1513967155, remote generation 1513967155
TRACE [GossipStage:1] 2017-12-22 12:46:26,759 Gossiper.java:1193 - Updating heartbeat state version to 14666 from 14663 for /10.63.114.158 ...
TRACE [GossipStage:1] 2017-12-22 12:46:26,759 Gossiper.java:986 - Sending a EchoMessage to /10.63.114.158
TRACE [GossipStage:1] 2017-12-22 12:46:26,759 MessagingService.java:760 - /10.63.114.150 sending ECHO to 6794@/10.63.114.158
TRACE [GossipTasks:1] 2017-12-22 12:46:26,856 Gossiper.java:144 - My heartbeat is now 2258
TRACE [GossipTasks:1] 2017-12-22 12:46:26,856 Gossiper.java:500 - Gossip Digests are : /10.63.114.150:1513971247:2258 /10.63.114.154:1513967154:14664 /10.63.114.158:1513967155:14666
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 Gossiper.java:646 - Sending a GossipDigestSyn to /10.63.114.158 ...
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 MessagingService.java:760 - /10.63.114.150 sending GOSSIP_DIGEST_SYN to 6795@/10.63.114.158
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 Gossiper.java:646 - Sending a GossipDigestSyn to /10.63.114.154 ...
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 MessagingService.java:760 - /10.63.114.150 sending GOSSIP_DIGEST_SYN to 6796@/10.63.114.154
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 Gossiper.java:757 - Performing status check ...
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 FailureDetector.java:298 - PHI for /10.63.114.158 : 0.09877714244264273
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 FailureDetector.java:315 - PHI for /10.63.114.158 : 0.09877714244264273
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 FailureDetector.java:316 - mean for /10.63.114.158 : 9.977908204545455E8
TRACE [GossipTasks:1] 2017-12-22 12:46:26,857 FailureDetector.java:298 - PHI for /10.63.114.154 : 0.27857029580273446
TRACE [GossipTasks:1] 2017-12-22 12:46:26,858 FailureDetector.java:315 - PHI for /10.63.114.154 : 0.27857029580273446
TRACE [GossipTasks:1] 2017-12-22 12:46:26,858 FailureDetector.java:316 - mean for /10.63.114.154 : 7.65247444583815E8
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 GossipDigestAckVerbHandler.java:41 - Received a GossipDigestAckMessage from /10.63.114.158
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 GossipDigestAckVerbHandler.java:52 - Received ack with 1 digests and 0 states
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 Gossiper.java:889 - local heartbeat version 2258 greater than 2257 for /10.63.114.150
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 GossipDigestAckVerbHandler.java:84 - Sending a GossipDigestAck2Message to /10.63.114.158
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 MessagingService.java:760 - /10.63.114.150 sending GOSSIP_DIGEST_ACK2 to 6797@/10.63.114.158
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 GossipDigestAckVerbHandler.java:41 - Received a GossipDigestAckMessage from /10.63.114.154
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 GossipDigestAckVerbHandler.java:52 - Received ack with 2 digests and 1 states
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 Gossiper.java:1140 - /10.63.114.154 local generation 1513967154, remote generation 1513967154
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 Gossiper.java:1193 - Updating heartbeat state version to 14664 from 14664 for /10.63.114.154 ...
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 Gossiper.java:986 - Sending a EchoMessage to /10.63.114.154
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 MessagingService.java:760 - /10.63.114.150 sending ECHO to 6798@/10.63.114.154
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 Gossiper.java:889 - local heartbeat version 2258 greater than 2257 for /10.63.114.150
TRACE [GossipStage:1] 2017-12-22 12:46:26,858 Gossiper.java:889 - local heartbeat version 14666 greater than 14663 for /10.63.114.158
TRACE [GossipStage:1] 2017-12-22 12:46:26,859 Gossiper.java:904 - Adding state SEVERITY: 0.0
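
The "Gossip Digests are" line in the trace above can be picked apart with a short script. This is a sketch under two assumptions that the thread itself does not confirm: each digest has the form /endpoint:generation:max_version, and the generation is the node's start time in epoch seconds. The names Digest and parse_digests are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple


class Digest(NamedTuple):
    endpoint: str
    generation: int        # assumed: epoch seconds at node start
    max_version: int       # highest heartbeat/state version seen
    started_utc: datetime  # generation rendered as a UTC timestamp


def parse_digests(text: str) -> List[Digest]:
    """Parse whitespace-separated '/ip:generation:version' digest tokens."""
    digests = []
    for token in text.split():
        endpoint, generation, version = token.lstrip("/").rsplit(":", 2)
        gen = int(generation)
        started = datetime.fromtimestamp(gen, tz=timezone.utc)
        digests.append(Digest(endpoint, gen, int(version), started))
    return digests
```

If the epoch-seconds assumption holds, comparing generations shows which node restarted most recently: here 10.63.114.150 carries a generation 4092 seconds newer than 10.63.114.158, along with a much lower heartbeat version.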

Thanks and Regards
A.SathishKumar



On Thu, Dec 21, 2017 at 9:40 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> Not received is ambiguous.
>
> Can you paste the logs somewhere like pastebin? Sanitize/anonymize as
> needed.

Re: Cassandra takes more time to join the cluster

2017-12-21 Thread sat
Hi,

We observe that gossip is not received by the other nodes, and we see a
"Convicted" message in debug.log even though the Cassandra process is
running on all 3 nodes.

Thanks and Regards
A.SathishKumar

On Thu, Dec 21, 2017 at 9:12 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> What’s it logging?


-- 
A.SathishKumar
044-24735023


Cassandra takes more time to join the cluster

2017-12-21 Thread sat
We have configured a 3-node Cassandra cluster on RHEL 7.2 and are doing
cluster testing. When we start Cassandra on all 3 nodes, they form a
cluster and work fine.

But when we bring one node down using the "init 6" or "reboot" command, the
rebooted node takes much longer to join the cluster. If we instead manually
kill and restart the Cassandra process, the node joins the cluster
immediately without any issues.

We have provided all 3 IPs as seed nodes, the cluster name is the same on
all 3 nodes, and each node uses its own IP as the listen address.

SELinux is also disabled.

Please help us in resolving this issue.

Thanks


Re: Priority for cassandra nodes in cluster

2016-11-12 Thread sat
Hi,

Thanks all for your valuable suggestion.

Thanks and Regards
A.SathishKumar

On Sat, Nov 12, 2016 at 2:59 PM, Ben Bromhead <b...@instaclustr.com> wrote:

> +1 w/ Benjamin.
>
> However if you wish to make use of spare hardware capacity, look to
> something like mesos DC/OS or kubernetes. You can run multiple services
> across a fleet of hardware, but provision equal resources to Cassandra and
> have somewhat reliable hardware sharing mechanisms.
>
> On Sat, 12 Nov 2016 at 14:12 Jon Haddad <jonathan.had...@gmail.com> wrote:
>
>> Agreed w/ Benjamin.  Trying to diagnose issues in prod will be a
>> nightmare.  Keep your DB servers homogeneous.
>>
>> On Nov 12, 2016, at 1:52 PM, Benjamin Roth <benjamin.r...@jaumo.com>
>> wrote:
>>
>> 1. From 15 years of experience running distributed services: don't mix
>> services on machines if you don't have to. Dedicate each server to a
>> single task if you can afford it. It is easier to manage and reduces risk
>> in case of overload or failure.
>> 2. You can assign a different number of tokens to each node by setting
>> num_tokens in cassandra.yaml before you bootstrap that node.
>> --
> Ben Bromhead
> CTO | Instaclustr <https://www.instaclustr.com/>
> +1 650 284 9692
> Managed Cassandra / Spark on AWS, Azure and Softlayer
>



-- 
A.SathishKumar
044-24735023


Priority for cassandra nodes in cluster

2016-11-12 Thread sat
Hi,

We are planning to install a 3-node cluster in a production environment. Is
it possible to assign a weight or priority to the nodes in the cluster?

E.g., we want more records to be written to the first 2 nodes and fewer to
the 3rd node. We are considering this approach because we want to install
another IO-intensive messaging server on the 3rd node and would like to
reduce the Cassandra load there.
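
One reply in this thread suggests giving each node a different number of tokens. As a rough sketch of what that would mean, assuming token ownership is approximately proportional to each node's token count (with randomly allocated tokens this is only an approximation), the expected share per node can be estimated as below. The node names and token counts are made-up illustrations, and expected_share is a hypothetical helper.

```python
from typing import Dict


def expected_share(num_tokens: Dict[str, int]) -> Dict[str, float]:
    """Approximate fraction of the token ring owned by each node."""
    total = sum(num_tokens.values())
    return {node: count / total for node, count in num_tokens.items()}


# Illustrative values only: the third node gets half as many tokens.
shares = expected_share({"node1": 256, "node2": 256, "node3": 128})
```

Keep in mind that this describes primary token ownership. If the replication factor equals the number of nodes, every node stores a replica of all data anyway, so weighting tokens would not reduce the write volume on the third node.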


Thanks and Regards
A.SathishKumar


Re: ITrigger - Help

2016-11-11 Thread sat
Hi Siddharth Verma,

We explored this option. It seems to output the changes only to a log file,
and we cannot get notified through a listener class. Could you please tell
us what kind of information is written to the commit log and when we should
read it? Do we need to instantiate CommitLogReader.java and read the commit
log every few seconds? Could you please point us to a detailed
example/tutorial of how to use this?

Thanks and Regards
A.SathishKumar

On Fri, Nov 11, 2016 at 10:13 AM, siddharth verma <
sidd.verma29.l...@gmail.com> wrote:

> Hi Sathish,
> You could look into Change Data Capture (CDC):
> https://issues.apache.org/jira/browse/CASSANDRA-8844
> It might help you with some of your requirements.
>
> Regards
> Siddharth Verma



-- 
A.SathishKumar
044-24735023


Re: ITrigger - Help

2016-11-11 Thread sat
Hi Jon,

Thanks for your prompt answer.

Thanks
A.SathishKumar

On Fri, Nov 11, 2016 at 10:04 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> cqlsh uses the Python driver, I don't see how there would be any way to
> differentiate where the request came from unless you stuck an extra field
> in the table that you always write when you're not in cqlsh, or you
> modified cqlsh to include that field whenever it did an insert.
>
> Checking the ITrigger source, all you get is a reference to the
> ColumnFamily and some metadata.  At a glance of trunk, it doesn't look
> like you get the user that initiated the query.
>
> To be honest, I wouldn't do any of this, it feels like it's going to
> become an error-prone mess.  Your best bet is to layer something on top of
> the driver yourself.  The cleanest way I can think of, long term, is to
> submit a JIRA / patch to enable some class loading & listener hooks in
> cqlsh itself.  Without a patch and a really good use case I don't know who
> would want to maintain that though, as it would lock the team into using
> Python for cqlsh.
>
> Jon


-- 
A.SathishKumar
044-24735023


ITrigger - Help

2016-11-11 Thread sat
Hi,

We are planning to use ITrigger to get notified of changes made when we
execute scripts or run commands at the cqlsh prompt. If an operation is
performed through our application's CRUD API, we plan to handle the
notification in the CRUD API itself. However, if a user performs an
operation (such as a write) at the cqlsh prompt, we want to capture those
changes and update the modules that are listening for them.

Could you please let us know whether it is possible to differentiate
updates done through the cqlsh prompt from those done through the
application?

We also thought about creating multiple users in Cassandra and using
different users for cqlsh and for the application. If we go with this
approach, do we get the user who modified the table in the ITrigger
implementation (i.e., the augment method)?


Basically, we are trying to restrict the use of ITrigger to the cqlsh
prompt, as it is somewhat complex and risky (we learned that it can impact
the Cassandra process running on that node).

Thanks and Regards
A.SathishKumar


Re: Cassandra Triggers

2016-11-09 Thread sat
Hi,

We are doing a POC on Cassandra for our business needs. We also need some
kind of notification when a column/attribute of a table is modified
(insert/update/delete).

Thanks for sharing the information about CDC. Could you please point us to
an example of how to implement this in Cassandra 3.9?


Thanks and Regards
A.SathishKumar

On Wed, Nov 9, 2016 at 6:18 AM, DuyHai Doan  wrote:

> They are production ready in the sense that they are fully functional. But
> using them requires a *deep* knowledge of Cassandra's internal write path
> and is dangerous because the write path is critical.
>
> Alternatively, if you need a notification system for new mutations, there
> is a CDC feature, available since 3.9 only (maybe not production ready yet).
>
> On Wed, Nov 9, 2016 at 3:11 PM, Nethi, Manoj  wrote:
>
>> Hi,
>>
>> Are triggers in Cassandra production ready?
>>
>> Version: Cassandra 3.3.0
>>
>>
>>
>> Thanks
>>
>> Manoj
>>
>>
>


-- 
A.SathishKumar
044-24735023


Re: Designing a table in cassandra

2016-11-07 Thread sat
Hi Carlos Alonso,

Thanks for your quick answer.

Thanks and Regards
A.SathishKumar

On Mon, Nov 7, 2016 at 2:26 AM, Carlos Alonso <i...@mrcalonso.com> wrote:

> Hi,
>
> I think your best bet is, as usual, the simplest one that can work, which,
> to me, in this case is the 3rd one. Creating one single device table that
> contains the different 'versions' of the configuration over time, along
> with a flag to know whether it was updated by user or by network, gives you
> all the flexibility you need. The primary key you suggest sounds good to me.
>
> To finally validate the model it would be good to know which are the
> queries you're thinking of running against this model because as you
> probably know, Cassandra models should be query driven.
>
> The suggested primary key will work for queries like "Give me the
> version(s) of this particular device_name in this particular time range"
>
> Hope it helps.
>
> Regards
>
> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>
> On 7 November 2016 at 01:23, sat <sathish.al...@gmail.com> wrote:
>
>> Hi,
>>
>> We are new to Cassandra. For our POC, we tried creating tables and
>> inserting rows as JSON, and all of this went fine. Now we are trying to
>> implement one of our application scenarios, and I am having difficulty
>> coming up with the best approach.
>>
>> Scenario:
>> We have a Device POJO. Some of its attributes/fields can be read and
>> written by users as well as by the network, and some can be modified only
>> by the network. When users need to configure a device, they create an
>> instance of the Device POJO and set the applicable fields; however, the
>> network can later update those attributes. We want to know the discrepancy
>> between the values configured by users and the values updated by the
>> network. Hence we have thought of 3 different approaches:
>>
>> 1) Create multiple tables for the same Device, like Device_Users and
>> Device_Network, so that we can see the difference.
>>
>> 2) Create different keyspaces, as multiple objects like Device can have
>> the same requirement.
>>
>> 3) Create one "Device" table and insert one row for the user configuration
>> and another row for the network update. We will create this table with a
>> composite primary key (device_name, updated_by).
>>
>> Please let us know which is the best option (with their pros and cons if
>> possible) among these 3, and also let us know if there are other options.
>>
>> Thanks and Regards
>> A.SathishKumar
>>
>
>


-- 
A.SathishKumar
044-24735023


Designing a table in cassandra

2016-11-06 Thread sat
Hi,

We are new to Cassandra. For our POC, we tried creating tables and inserting
rows as JSON, and all of this went fine. Now we are trying to implement one
of our application scenarios, and I am having difficulty coming up with the
best approach.

Scenario:
We have a Device POJO. Some of its attributes/fields can be read and written
by users as well as by the network, and some can be modified only by the
network. When users need to configure a device, they create an instance of
the Device POJO and set the applicable fields; however, the network can
later update those attributes. We want to know the discrepancy between the
values configured by users and the values updated by the network. Hence we
have thought of 3 different approaches:

1) Create multiple tables for the same Device, like Device_Users and
Device_Network, so that we can see the difference.

2) Create different keyspaces, as multiple objects like Device can have the
same requirement.

3) Create one "Device" table and insert one row for the user configuration
and another row for the network update. We will create this table with a
composite primary key (device_name, updated_by).

Please let us know which is the best option (with their pros and cons if
possible) among these 3, and also let us know if there are other options.

Thanks and Regards
A.SathishKumar
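
To make approach 3 above a little more concrete, here is a minimal sketch in
plain Python that only simulates the idea in memory; it does not talk to
Cassandra. The column names other than device_name and updated_by, the
'user'/'network' values, and the helper names upsert and discrepancy are all
assumptions made for illustration, and the CQL in the comment is a
hypothetical schema, not one taken from this thread.

```python
from typing import Any, Dict, Tuple

# Hypothetical CQL for approach 3 (illustrative only):
#   CREATE TABLE device (
#       device_name text,
#       updated_by  text,   -- assumed values: 'user' or 'network'
#       mtu         int,
#       speed       text,
#       PRIMARY KEY (device_name, updated_by)
#   );

Row = Dict[str, Any]
# In-memory stand-in for the table, keyed by the composite primary key.
Table = Dict[Tuple[str, str], Row]


def upsert(table: Table, device_name: str, updated_by: str, fields: Row) -> None:
    """Mimic a write: a second write to the same key updates the same row."""
    row = table.setdefault((device_name, updated_by), {})
    row.update(fields)


def discrepancy(table: Table, device_name: str) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (user_value, network_value)} for fields that differ.

    Only fields present in both the 'user' row and the 'network' row are
    compared; network-only fields are ignored.
    """
    user = table.get((device_name, "user"), {})
    network = table.get((device_name, "network"), {})
    return {
        name: (user[name], network[name])
        for name in user
        if name in network and user[name] != network[name]
    }
```

With this layout, both rows for a device share the partition key device_name,
so in the real table a single query on device_name would be expected to return
the user row and the network row together, and the comparison can then be done
on the client side as in discrepancy above.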