Disk full during new node bootstrap

2017-02-03 Thread techpyaasa .
Hi,

We are using c* 2.0.17, 2 DCs, RF=3.

When I try to add a new node to one group in a DC, the disk fills up. Can
someone please tell me the best way to resolve this?

Run compaction on the nodes in that group (the group to which I'm going to
add the new node, since data streams to the new node from the nodes of the
group it is added to)

OR

Bootstrap/add 2 (multiple) nodes at a time?


Please suggest a better way to fix this.

Thanks in advance
Techpyaasa
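Before retrying, it is worth checking whether the source nodes are carrying
reclaimable data. A rough sketch of commands that can free disk on the
streaming-source nodes before a new bootstrap attempt (paths assume the
default data directory; verify what a command will remove before running it):

    # see what is actually consuming the disk (default data directory assumed)
    df -h /var/lib/cassandra
    du -sh /var/lib/cassandra/data/*/*/snapshots 2>/dev/null

    # drop old snapshots (often left behind by repairs/drops) - frees space immediately
    nodetool clearsnapshot

    # remove data for ranges a node no longer owns (useful after topology changes)
    nodetool cleanup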


Re: Disk full during new node bootstrap

2017-02-04 Thread techpyaasa .
Cluster Information:
Name:  Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
397560b8-7245-3903-8828-60a97e5be4aa: [xxx.xxx.xxx.75, xxx.xxx.xxx.134,
xxx.xxx.xxx.192, xxx.xxx.xxx.132, xxx.xxx.xxx.133, xxx.xxx.xxx.115,
xxx.xxx.xxx.78, xxx.xxx.xxx.123, xxx.xxx.xxx.70, xxx.xxx.xxx.167,
xxx.xxx.xxx.168, xxx.xxx.xxx.169, xxx.xxx.xxx.146, xxx.xxx.xxx.145,
xxx.xxx.xxx.144, xxx.xxx.xxx.143, xxx.xxx.xxx.140, xxx.xxx.xxx.139,
xxx.xxx.xxx.126, xxx.xxx.xxx.136, xxx.xxx.xxx.135, xxx.xxx.xxx.191,
xxx.xxx.xxx.133, xxx.xxx.xxx.79, xxx.xxx.xxx.131, xxx.xxx.xxx.77]

ReleaseVersion: 2.0.17
---
Note: Ownership information does not include topology; for complete
information, specify a keyspace
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load   Tokens  Owns   Host ID
  Rack
UN  xxx.xxx.xxx.168  847.62 GB  256 4.1%
2302491a-a8b5-4aa6-bda7-f1544064c4e3  GRP3
UN  xxx.xxx.xxx.169  819.64 GB  256 4.2%
d5e5bc3d-38de-4043-abca-08ac09f29a46  GRP1
UN  xxx.xxx.xxx.75   874.69 GB  256 4.1%
fdd32c67-3cea-4174-b59b-c1ea14e1a334  GRP1
UN  xxx.xxx.xxx.78   850.07 GB  256 4.0%
a8332f22-a75f-4d7c-8b71-7284f6fe208f  GRP3
UN  xxx.xxx.xxx.126  836.88 GB  256 4.0%
71be90d8-97db-4155-b4fc-da59d78331ef  GRP1
UN  xxx.xxx.xxx.191  751.08 GB  256 4.1%
a9023df8-a8b3-484b-a03d-0fdea35007bd  GRP3
UN  xxx.xxx.xxx.192  888.03 GB  256 3.8%
f4ad42d5-cee0-4d3e-a4f1-7cdeb5d7390a  GRP2
UN  xxx.xxx.xxx.132  688.86 GB  256 3.8%
6a465101-29e7-4792-8269-851200a70023  GRP2
UN  xxx.xxx.xxx.133  855.66 GB  256 4.0%
751ce15a-10f1-44cf-9357-04da7e21b511  GRP2
UN  xxx.xxx.xxx.134  869.32 GB  256 3.7%
bdd166fd-95a7-4119-bbae-f05fe26ddb01  GRP3
UN  xxx.xxx.xxx.70   792.15 GB  256 4.2%
2b6b642d-6842-47d4-bdc1-95226fd2b85d  GRP1
UN  xxx.xxx.xxx.167  732.82 GB  256 4.0%
45f6684f-d6a0-4cba-875c-9db459646545  GRP2
Datacenter: DC2
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load   Tokens  Owns   Host ID
  Rack
UN  xxx.xxx.xxx.136   762.85 GB  256 3.9%
ebc67006-80e6-40de-95a9-79b90b254750  GRP3
UN  xxx.xxx.xxx.139   807.68 GB  256 4.5%
e27ca655-3186-417f-a927-cab63dc34248  GRP3
UN  xxx.xxx.xxx.140   771.94 GB  256 4.1%
26531fe3-7f2c-4ce5-a41b-5c79c5976141  GRP1
UN  xxx.xxx.xxx.77   505.54 GB  256 3.5%
d1ad7194-d7fb-47cf-92d1-4206ff12f8aa  GRP1
UN  xxx.xxx.xxx.143   900.14 GB  256 4.1%
74e1009c-0506-4d7a-b517-d37182385a21  GRP2
UJ  xxx.xxx.xxx.79   636.08 GB  256 ?
 91b64758-67c2-48e7-86eb-43f7509c2287  GRP3
UN  xxx.xxx.xxx.131   788 GB 256 4.0%
5b27a680-d7c0-4ead-85cc-c295b83eda5b  GRP2
UN  xxx.xxx.xxx.133   898.27 GB  256 3.8%
5b24f211-678e-4614-bd59-8ea13aa2397c  GRP1
UN  xxx.xxx.xxx.135   868.14 GB  256 4.1%
8c2b5d1c-e43e-41f4-b21e-58cb525cfbab  GRP2
UN  xxx.xxx.xxx.123   848.86 GB  256 4.0%
87cfff8f-1cfc-44c5-b608-5b40d6894182  GRP3
UN  xxx.xxx.xxx.144   830.99 GB  256 3.6%
31b8cf4b-dd08-4ee6-8c25-90ad6dadbdc4  GRP3
UN  xxx.xxx.xxx.145   832.22 GB  256 4.3%
dd8c97df-7ec9-436b-8b29-4c25a8a89184  GRP1
UN  xxx.xxx.xxx.146   830.02 GB  256 4.2%
88b52574-8569-4d58-ba43-8fe1c742eea4  GRP2
UN  xxx.xxx.xxx.115   878.5 GB   256 3.9%
20817b9e-b761-437e-aa7b-49e90483c69f  GRP1




All 5 keyspaces use 'class': 'NetworkTopologyStrategy' with replication
'DC1': '3', 'DC2': '3'.

On Sat, Feb 4, 2017 at 3:22 PM, Alexander Dejanovski <a...@thelastpickle.com
> wrote:

> Hi,
>
> could you share the following information with us?
>
> - "nodetool status" output
> - Keyspace definitions (we need to check the replication strategy you're
> using on all keyspaces)
> - Specifics about what you're calling "groups" in a DC. Are these racks?
>
> Thanks
>
> On Sat, Feb 4, 2017 at 10:41 AM laxmikanth sadula <laxmikanth...@gmail.com>
> wrote:
>
>> Yes .. same number of tokens...
>> 256
>>
>> On Sat, Feb 4, 2017 at 11:56 AM, Jonathan Haddad <j...@jonhaddad.com>
>> wrote:
>>
>> Are you using the same number of tokens on the new node as the old ones?
>>
>> On Fri, Feb 3, 2017 at 8:31 PM techpyaasa . <techpya...@gmail.com> wrote:
>>
>> Hi,
>>
>> We are using c* 2.0.17, 2 DCs, RF=3.
>>
>> When I try to add a new node to one group in a DC, the disk fills up. Can
>> someone please tell me the best way to resolve this?
>>
>> Run compaction on the nodes in that group (the group to which I'm going to add
>> the new node, since data streams to the new node from the nodes of the group it
>> is added to)
>>
>> OR
>>
>> Bootstrap/add 2 (multiple) nodes at a time?
>>
>>
>> Please suggest a better way 

Re: New node block in autobootstrap

2016-09-28 Thread techpyaasa .
Very sorry... I found the reason for this issue.
Please ignore.


On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa . <techpya...@gmail.com> wrote:

> @Paulo
>
> We have made the changes you suggested:
> net.ipv4.tcp_keepalive_time=60
> net.ipv4.tcp_keepalive_probes=3
> net.ipv4.tcp_keepalive_intvl=10
>
> and increased streaming_socket_timeout_in_ms to 48 hours and set
> "phi_convict_threshold: 9".
>
> We then recommissioned the new data center (DC3) again and ran "nodetool
> rebuild 'DC1'", but this time NO data got streamed and 'nodetool rebuild'
> exited without any exception.

Re: New node block in autobootstrap

2016-09-28 Thread techpyaasa .
@Paulo

We have made the changes you suggested:
net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10

and increased streaming_socket_timeout_in_ms to 48 hours and set
"phi_convict_threshold: 9".

We then recommissioned the new data center (DC3) again and ran "nodetool
rebuild 'DC1'", but this time NO data got streamed and 'nodetool rebuild'
exited without any exception.

Please check logs below

INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:44,571 StorageService.java (line 914) rebuild from dc: IDC
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,520 StreamResultFuture.java (line 87) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Executing streaming plan for Rebuild
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,521 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.75
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,522 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.132
INFO [StreamConnectionEstablisher:1] 2016-09-28 09:18:47,522 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.75
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,522 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.133
INFO [StreamConnectionEstablisher:2] 2016-09-28 09:18:47,522 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.132
INFO [StreamConnectionEstablisher:3] 2016-09-28 09:18:47,523 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.133
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,523 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.167
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,524 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.78
INFO [StreamConnectionEstablisher:4] 2016-09-28 09:18:47,524 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.167
INFO [StreamConnectionEstablisher:5] 2016-09-28 09:18:47,525 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.78
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,524 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.126
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,525 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.191
INFO [StreamConnectionEstablisher:6] 2016-09-28 09:18:47,526 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.126
INFO [StreamConnectionEstablisher:7] 2016-09-28 09:18:47,526 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.191
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,526 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.168
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,527 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.169
INFO [StreamConnectionEstablisher:8] 2016-09-28 09:18:47,527 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.168
INFO [StreamConnectionEstablisher:9] 2016-09-28 09:18:47,528 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.169
INFO [STREAM-IN-/xxx.xxx.198.132] 2016-09-28 09:18:47,713 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.132 is complete
INFO [STREAM-IN-/xxx.xxx.198.191] 2016-09-28 09:18:47,715 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.191 is complete
INFO [STREAM-IN-/xxx.xxx.198.133] 2016-09-28 09:18:47,716 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.133 is complete
INFO [STREAM-IN-/xxx.xxx.198.169] 2016-09-28 09:18:47,716 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.169 is complete
INFO [STREAM-IN-/xxx.xxx.198.167] 2016-09-28 09:18:47,715 StreamResultFuture.java (line 186) 

Re: New node block in autobootstrap

2016-09-28 Thread techpyaasa .
Forgot to set replication for new data center :(

On Wed, Sep 28, 2016 at 11:33 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> What was the reason?
>
> On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . <techpya...@gmail.com> wrote:
>
>> Very sorry... I found the reason for this issue.
>> Please ignore.
>>
>>
>> On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa . <techpya...@gmail.com>
>> wrote:
>>
>>> @Paulo
>>>
>>> We have made the changes you suggested:
>>> net.ipv4.tcp_keepalive_time=60
>>> net.ipv4.tcp_keepalive_probes=3
>>> net.ipv4.tcp_keepalive_intvl=10
>>>
>>> and increased streaming_socket_timeout_in_ms to 48 hours and set
>>> "phi_convict_threshold: 9".
>>>
>>> We then recommissioned the new data center (DC3) again and ran "nodetool
>>> rebuild 'DC1'", but this time NO data got streamed and 'nodetool rebuild'
>>> exited without any exception.

nodetool rebuild streaming exception

2016-09-27 Thread techpyaasa .
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
    at java.lang.Thread.run(Thread.java:745)
DEBUG [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 ConnectionHandler.java (line 244) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Received File (Header (cfId: 68af9ee0-96f8-3b1d-a418-e5ae844f2cc2, #3, version: jb, estimated keys: 4736, transfer size: 2306880, compressed?: true), file: /home/cassandra/data_directories/data/keyspace_name1/archiving_metadata/keyspace_name1-archiving_metadata-tmp-jb-27-Data.db)
ERROR [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
java.lang.RuntimeException: Outgoing stream handler has been closed
    at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:126)
    at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:524)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:413)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
    at java.lang.Thread.run(Thread.java:745)


I checked with our network operations team; they have confirmed the network
is stable with no hiccups.
I have set 'streaming_socket_timeout_in_ms: 86400000' (24 hours) as
suggested in the DataStax article -
https://support.datastax.com/hc/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures
and ran 'nodetool rebuild' one node at a time, but it was of NO USE. We are
still getting the above exception.

Can someone please help me debug and fix this.


Thanks,
techpyaasa
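For reference, a sketch of the cassandra.yaml settings discussed in this
thread (values taken from the messages above; 86400000 ms = 24 hours):

    # cassandra.yaml (c* 2.0.x)
    streaming_socket_timeout_in_ms: 86400000   # 24 hours (0 disables the stream socket timeout)
    phi_convict_threshold: 9                   # higher = failure detector less likely to convict a node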


Difference in token range count

2016-09-30 Thread techpyaasa .
Hi ,

We have c* 2.0.17 with 3 data centers. Each data center has 9 nodes, with
vnodes enabled on all nodes.

When I ran a -local repair (./nodetool -local repair keyspace_name1
columnfamily_1) on one of the data centers, I saw the following print:

"Starting repair command #3, repairing 2647 ranges for keyspace
keyspace_name1"

The range count is supposed to be 2304 (256 * 9), as we have 9 nodes in one
data center, right? So why is it showing 2647 ranges?

Can someone please clarify this difference in the token range count?

Thanks
techpyaasa


Cannot set TTL in COPY command

2016-10-26 Thread techpyaasa .
Hi all,

I'm getting the following error when I try to set a TTL using the COPY
command, whereas it works fine without the TTL option. I followed the doc at
https://docs.datastax.com/en/cql/3.1/cql/cql_reference/copy_r.html

"Improper COPY command."

The command used is:

COPY keyspace1.columnFamily1 FROM 'dump_data.csv' WITH TTL = '7200';


The version is:

[cqlsh 4.1.1 | Cassandra 2.0.17 | CQL spec 3.1.1 | Thrift protocol 19.39.0]


Can someone please tell me how to set a TTL using the COPY command?

Thanks
Techpyaasa


Re: Cannot set TTL in COPY command

2016-10-26 Thread techpyaasa .
To add to the above email: it's an already-existing table with some data,
NOT a new table.
Is it OK to change the table property default_time_to_live now?
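
Since cqlsh COPY in this version has no TTL option, one workaround is to set
the TTL at the table level before loading. A sketch (note that
default_time_to_live applies only to rows written after the change; rows
already in the table keep whatever TTL they were written with):

    ALTER TABLE keyspace1.columnFamily1 WITH default_time_to_live = 7200;
    COPY keyspace1.columnFamily1 FROM 'dump_data.csv';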

On Wed, Oct 26, 2016 at 9:06 PM, laxmikanth sadula <laxmikanth...@gmail.com>
wrote:

> You mean to say instead of
> COPY keyspace1.columnFamily1 FROM 'dump_data.csv' WITH TTL = '7200';
> use
> COPY keyspace1.columnFamily1 FROM 'dump_data.csv' WITH DEFAULT_TIME_TO_LIVE = '7200';
>
> I tried this way too, but again an exception was thrown saying "Unrecognized
> COPY FROM options: default_time_to_live" :( :(
>
> On Wed, Oct 26, 2016 at 8:53 PM, Lahiru Gamathige <lah...@highfive.com>
> wrote:
>
>> You have to use with default_time_to_live = 7200.
>>
>
>
> --
> Regards,
> Laxmikanth
> 99621 38051
>
>


OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread techpyaasa .
Hi all,

The following exception is thrown sometimes even though all nodes are up.


SEVERE: This error occurs if there are not enough Cassandra nodes for
the required QUORUM to persist data. Please make sure enough nodes are up
at this point of time. Error Count is at 150
Exception com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried:
/192.168.198.168:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)),
/192.168.198.169:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)),
/192.168.198.75:9042 (com.datastax.driver.core.OperationTimedOutException: [/192.168.198.75:9042] Operation timed out))
    at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
    at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
    at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
    at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
    at 

We are using c* 2.0.17 and the DataStax Java driver
cassandra-driver-core-2.1.8.jar.

In cassandra.yaml the following are set:
rpc_address: 0.0.0.0, broadcast_address: 1.2.3.4

This exception is thrown for both READ & WRITE queries. Can someone please
help me debug this?


Thanks
Techpyaasa
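
The DriverException above explicitly suggests raising the per-host connection
limits. A minimal sketch of doing that with the 2.1 Java driver's
PoolingOptions (the contact point and pool sizes here are illustrative
assumptions, not tuned values):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.HostDistance;
    import com.datastax.driver.core.PoolingOptions;
    import com.datastax.driver.core.Session;

    public class PoolTuning {
        public static void main(String[] args) {
            // Allow more connections per host before "Timeout while trying to
            // acquire available connection" is raised (sizes are illustrative).
            PoolingOptions pooling = new PoolingOptions()
                    .setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
                    .setMaxConnectionsPerHost(HostDistance.LOCAL, 8);

            Cluster cluster = Cluster.builder()
                    .addContactPoint("192.168.198.168") // any reachable node
                    .withPoolingOptions(pooling)
                    .build();
            Session session = cluster.connect();
            // ... run queries ...
            cluster.close();
        }
    }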


Re: OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread techpyaasa .
I tried that; it didn't work out... :(

On Thu, Nov 24, 2016 at 4:49 PM, Vladimir Yudovin <vla...@winguzone.com>
wrote:

> >rpc_address: 0.0.0.0  , broadcast_address: 1.2.3.4
> Did you try setting rpc_address to the node IP and not to 0.0.0.0?
>
> Best regards, Vladimir Yudovin,
> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>
>
> On Thu, 24 Nov 2016 04:50:08 -0500, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>
> Did you already try doing what the error message indicates you should try?
>
>
>
> Is there anything in the logs on the 3 cassandra boxes listed
> (192.168.198.168, 192.168.198.169, 192.168.198.75) that indicates they had
> problems at that time, perhaps GCInspector or StatusLogger messages about
> pauses, or any drops in network utilization to indicate a networking
> problem?
>


Re: Can nodes in c* cluster run different versions ?

2016-11-16 Thread techpyaasa .
Thank you @Alain

On Wed, Nov 16, 2016 at 9:13 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Hey Techpyaasa,
>
> Are you aware of this documentation?
>
> https://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgrdCassandraDetails.html
>
> Basically yes, you can have multiple versions, but you want to keep this
> multi-version period as short as possible.
>
> As it might take a few days since it will involve 'upgrade sstables', I just
>> wanted to know whether there would be any problem from this mismatch of c*
>> versions among nodes in the cluster during the upgrade process?
>
>
> To make the upgrade as short as possible I usually migrate the whole
> cluster and only run 'upgrade sstables' afterwards. This minimises the time
> you're running multiple versions.
>
> The only (soft) limitation is that you shouldn't be streaming data around
> (do not bootstrap, repair, or remove a node). It might work, that's why I
> say "soft limitation", but it might not; there is no guarantee, as it
> mainly depends on the changes that were made between versions, I believe.
>
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
>
> 2016-11-16 11:12 GMT+01:00 techpyaasa . <techpya...@gmail.com>:
>
>> Hi all,
>>
>> We are currently running c* 2.0.17 with 2 data centers, each with 18 nodes.
>>
>> We would like to upgrade to c* 2.1.16. Can we upgrade all nodes (one by
>> one) in one DC first and then move on to the next data center?
>>
>> As it might take a few days since it will involve 'upgrade sstables', I just
>> wanted to know whether there would be any problem from this mismatch of c*
>> versions among nodes in the cluster during the upgrade process?
>>
>> Thanks
>> Techpyaasa
>>
>
>
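
For reference, a rough per-node sequence for such a rolling upgrade,
following the documentation linked above (the stop/start commands are
placeholders for whatever service manager the install uses):

    nodetool drain                    # flush memtables; node stops accepting writes
    sudo service cassandra stop       # placeholder; depends on the install
    # upgrade the Cassandra binaries/package to 2.1.16 here
    sudo service cassandra start
    nodetool version && nodetool status   # verify the node rejoined on the new version

    # only after ALL nodes in ALL DCs are on 2.1.16:
    nodetool upgradesstables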


UDF/UDA for json data aggregation

2016-11-16 Thread techpyaasa .
Hi all,

We'd like to use UDF/UDA in c* 2.2 and above to aggregate attribute values
in JSON data.

For example below table.


*"CREATE  TABLE test ( id bigint , time1 bigint , jsonData text , PRIMARY
KEY(id,time1));*



*cqlsh:test> INSERT INTO test (id , time1 , jsonData ) VALUES ( 1, 123,
'{"node1":{"attr1":"91","attr2":"1","attr3":"333"},"node2":{"attr4":"1.01","attr5":"1.231","attr6":"1.12"}}');*


*cqlsh:test> INSERT INTO test (id , time1 , jsonData ) VALUES ( 2, 345,
'{"node1":{"attr1":"22","attr2":"4","attr3":"111"},"node2":{"attr4":"2.01","attr5":"3.231","attr6":"2.112"}}');*
*cqlsh:test> INSERT INTO test (id , time1 , jsonData ) VALUES ( 3, 333,
'{"node1":{"attr1":"17","attr2":"56","attr3":"167"},"node2":{"attr4":"1.11","attr5":"2.31","attr6":"3.112"}}');"*


Using a UDF/UDA, I want the attribute values of the JSON data to be
aggregated, something like below:

"select json_aggr(jsonData) from test; //SUM

{"node1":{"attr1":"130","attr2":"61","attr3":"611"},"node2":{"attr4":"4.13","attr5":"6.772","attr6":"6.344"}}"

// 130 = 91 + 22 + 17, etc.


Can we do something like this? Can we import Java classes from third-party
jars for parsing/updating JSON in the UDF we define? And how?
Can somebody please provide outline code for this?

Thanks,
Techpyaasa
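
I'm not certain third-party JSON jars can be loaded into UDFs (UDF bodies
are generally limited to JDK classes, and later releases sandbox them to a
whitelist of packages), so a safer sketch avoids JSON parsing entirely:
store the numeric attributes in a hypothetical flattened map<text, double>
column and sum it with a plain map-summing UDA. A sketch, assuming such a
column exists:

    -- hypothetical table: test.cs2 (id bigint, time1 bigint,
    --                               attrs map<text, double>, PRIMARY KEY (id, time1))

    CREATE OR REPLACE FUNCTION test.sumattrs (state map<text, double>, attrs map<text, double>)
    RETURNS NULL ON NULL INPUT RETURNS map<text, double>
    LANGUAGE java AS '
        for (java.util.Map.Entry<String, Double> e : attrs.entrySet()) {
            Double cur = (Double) state.get(e.getKey());
            state.put(e.getKey(), cur == null ? e.getValue() : cur + e.getValue());
        }
        return state;';

    CREATE OR REPLACE AGGREGATE test.json_sum (map<text, double>)
    SFUNC sumattrs STYPE map<text, double> INITCOND {};

    -- usage: SELECT json_sum(attrs) FROM test.cs2 WHERE id = 1;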


Can nodes in c* cluster run different versions ?

2016-11-16 Thread techpyaasa .
Hi all,

We are currently running c* 2.0.17 with 2 data centers, each with 18 nodes.

We would like to upgrade to c* 2.1.16. Can we upgrade all nodes (one by one)
in one DC first and then move on to the next data center?

As it might take a few days since it will involve 'upgrade sstables', I just
wanted to know whether there would be any problem from this mismatch of c*
versions among nodes in the cluster during the upgrade process?

Thanks
Techpyaasa


NoHostAvailableException

2016-11-21 Thread techpyaasa .
The following exception is intermittently thrown by the DataStax Java driver
even though all nodes are up (happening for both read & write queries).

*"Exception com.datastax.driver.core.exceptions.NoHostAvailableException:
All host(s) tried for query failed (no host was tried) at
com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
at
com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at
com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
at
com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
at"*

Using c* 2.0.17 and the DataStax Java driver cassandra-driver-core-2.1.8.jar.

In cassandra.yaml following were set
rpc_address: 0.0.0.0  , broadcast_rpc_address: 1.2.3.4

Has anyone faced such an issue? What could be the reason and the fix for it?

Thanks in advance


Techpyaasa.


Re: NoHostAvailableException

2016-11-21 Thread techpyaasa .
Hi Vladimir,

I have not modified anything for broadcast_address; I left it as it was:

# Leaving this blank will set it to the same value as listen_address
# broadcast_address: 1.2.3.4

The comment above says "Leaving this blank will set it to the same value as
listen_address", so it should be set to listen_address, and I have set
listen_address to the external IP on all nodes.
So I guess that should not be the problem... :(

What else could be the issue? :( :(

On Mon, Nov 21, 2016 at 4:21 PM, Vladimir Yudovin <vla...@winguzone.com>
wrote:

> Try to set *broadcast_rpc_address* on each node to its real external IP
> address.
>
> Best regards, Vladimir Yudovin,
>
> Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra.
> Launch your cluster in minutes.
>


Re: NoHostAvailableException

2016-11-21 Thread techpyaasa .
Sorry, that was a typo.
It is broadcast_address and NOT broadcast_rpc_address.
Also, there is no such configuration as broadcast_rpc_address in
cassandra.yaml in c* 2.0.17.
Very sorry once again.

This is the configuration I have in cassandra.yaml:

listen_address: [external IP]
# Address to broadcast to other Cassandra nodes
# Leaving this blank will set it to the same value as listen_address
# broadcast_address: 1.2.3.4   # commented out; I have not made any changes to it
rpc_address: 0.0.0.0
rpc_port: 9160

Thanks
TechPyaasa




On Mon, Nov 21, 2016 at 6:48 PM, Vladimir Yudovin <vla...@winguzone.com>
wrote:

> Not broadcast_address, but broadcast_rpc_address (you gave this
> example: rpc_address: 0.0.0.0, broadcast_rpc_address: 1.2.3.4)
>
>
> Best regards, Vladimir Yudovin,
>
> Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra.
> Launch your cluster in minutes.
>


Re: NoHostAvailableException

2016-11-21 Thread techpyaasa .
Hi Vladimir,

I have attached the cassandra.yaml we have in our setup; please take a look.

- Do you have native port 9042 open in the firewall?
Yes, 9042 is open on our firewall; I checked with our team.

- Can you connect to the cluster with cqlsh?
Yes, I'm able to connect to the cluster using cqlsh.

What else could be the issue? :(



On Mon, Nov 21, 2016 at 7:23 PM, Vladimir Yudovin <vla...@winguzone.com>
wrote:

> Yaml in 2.0.17 says
>
> # The address to bind the Thrift RPC service and native transport
> # server -- clients connect here.
> #
> # Leaving this blank has the same effect it does for ListenAddress,
> # (i.e. it will be based on the configured hostname of the node).
> #
> # Note that unlike ListenAddress above, it is allowed to specify 0.0.0.0
> # here if you want to listen on all interfaces, but that will break
> clients
> # that rely on node auto-discovery.
> #
> # For security reasons, you should not expose this port to the internet.
> Firewall it if needed.
> rpc_address: localhost
> # port for Thrift to listen for clients on
> rpc_port: 9160
>
>
> So probably rpc_address: 0.0.0.0 is a problem. Also, do you have native
> port 9042 open in the firewall (if there is one)?
> Can you connect to the cluster with cqlsh?
>
> Best regards, Vladimir Yudovin,
>
> Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra.
> Launch your cluster in minutes.
>
>


cassandra.yaml
Description: application/yaml


How to change Replication Strategy and RF

2016-12-29 Thread techpyaasa .
Hi all,

We have mistakenly set up a c* 2.0.17 cluster (with 1 DC, 3 racks, 2 nodes
in each rack, with SimpleStrategy & RF=1).
The data on each node is now nearly 1.4 GB+.

We would now like to change the replication strategy to
NetworkTopologyStrategy with RF=3, and also add a new data center to this
cluster.

Can someone please suggest the safest way to do so.

Thanks in advance,
Techpyaasa


Re: How to change Replication Strategy and RF

2016-12-30 Thread techpyaasa .
Thanks a lot, Kurt Greaves.

On Fri, Dec 30, 2016 at 5:58 AM, kurt Greaves  wrote:

>
> If you're already using the cluster in production and require no downtime,
> you should perform a datacenter migration to change the RF to 3.
> The rough process would be as follows:
>
>    1. Change the keyspace to NetworkTopologyStrategy with RF=1 (see the
>    ALTER KEYSPACE sketch after this message). You shouldn't increase the RF
>    here, as you would receive read failures because not all nodes have the
>    data they own. You would have to wait for a repair to complete to stop
>    any read failures.
>    2. Configure your clients to use a LOCAL_* consistency level and
>    DCAwareRoundRobinPolicy for load balancing (with the current DC configured).
>    3. Add a new datacenter, and configure its replication factor to be 3.
>    4. Rebuild the new datacenter by running nodetool rebuild <existing-dc> on
>    each node in the new DC.
>    5. Migrate your clients to the new datacenter, by switching the
>    contact points to nodes in the new DC and the load balancing policy DC to
>    the new DC.
>    6. At this point you can increase the replication factor on the old
>    DC to 3, and then run a repair. Once the repair successfully completes you
>    should have 2 DCs that you can use. If you need the DCs in separate
>    locations, you could change this step to adding another DC in the desired
>    other location and running rebuilds as per steps 2-4.
>
> - Kurt
>
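
A sketch of the ALTER KEYSPACE statements behind steps 1, 3, and 6 above
(keyspace and DC names are placeholders and must match what the snitch
reports):

    -- Step 1: switch to NetworkTopologyStrategy, keeping RF=1 in the existing DC
    ALTER KEYSPACE my_ks WITH replication =
        {'class': 'NetworkTopologyStrategy', 'DC1': 1};

    -- Step 3: start replicating to the new DC with RF=3
    ALTER KEYSPACE my_ks WITH replication =
        {'class': 'NetworkTopologyStrategy', 'DC1': 1, 'DC2': 3};
    -- then, on each new node (step 4): nodetool rebuild DC1

    -- Step 6: raise RF in the old DC, then run a repair
    ALTER KEYSPACE my_ks WITH replication =
        {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};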


Re: c* updates not getting reflected.

2017-07-12 Thread techpyaasa .
Hi Carlos Rolo

We are using LOCAL_QUORUM for both writes & reads.
I see there is a time difference of 2 minutes among the nodes; I think that
could be the reason.
Anyway, thanks for replying, Carlos Rolo.
Have a nice day :)

On Wed, Jul 12, 2017 at 12:45 AM, Carlos Rolo <r...@pythian.com> wrote:

> What consistency are you using on those queries?
>
> On 11 Jul 2017 19:09, "techpyaasa ." <techpya...@gmail.com> wrote:
>
>> Hi,
>>
>> We have a table with the following schema:
>>
>> CREATE TABLE ks1.cf1 ( pid bigint, cid bigint, resp_json text, status
>> int, PRIMARY KEY (pid, cid) ) WITH CLUSTERING ORDER BY (cid ASC) with LCS
>> compaction strategy.
>>
>> We make very frequent updates to this table with queries like:
>>
>> UPDATE ks1.cf1 SET status = 0 WHERE pid=1 AND cid=1;
>> UPDATE ks1.cf1 SET resp_json='' WHERE pid=1 AND cid=1;
>>
>>
>> Now we are seeing a strange issue where sometimes the status or resp_json
>> column value does not get updated when we query it with a SELECT.
>>
>> We are not seeing any exceptions during the UPDATE query executions.
>> Also, is there any way to make sure that the last UPDATE was a success?
>>
>> We are using c* - 2.1.17 , datastax java driver 2.1.18.
>>
>> Can someone point out what the issue is, or has anybody faced such a
>> strange issue?
>>
>> Any help is appreciated.
>>
>> Thanks in advance
>> TechPyaasa
>>
>
> --
>
>
>
>


c* updates not getting reflected.

2017-07-11 Thread techpyaasa .
Hi,

We have a table with the following schema:

CREATE TABLE ks1.cf1 ( pid bigint, cid bigint, resp_json text, status int,
PRIMARY KEY (pid, cid) ) WITH CLUSTERING ORDER BY (cid ASC) with LCS
compaction strategy.

We make very frequent updates to this table with queries like:

UPDATE ks1.cf1 SET status = 0 WHERE pid=1 AND cid=1;
UPDATE ks1.cf1 SET resp_json='' WHERE pid=1 AND cid=1;


Now we are seeing a strange issue where sometimes the status or resp_json
column value does not get updated when we query it with a SELECT.

We are not seeing any exceptions during the UPDATE query executions.
Also, is there any way to make sure that the last UPDATE was a success?

We are using c* 2.1.17 and DataStax Java driver 2.1.18.

Can someone point out what the issue is, or has anybody faced such a strange
issue?

Any help is appreciated.

Thanks in advance
TechPyaasa
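
One way to check which write actually "won" is to read back the server-side
write timestamps: under last-write-wins, the cell with the higher writetime
survives, so a 2-minute clock skew between coordinators (as suspected in the
reply above) can silently discard a newer UPDATE. A diagnostic sketch:

    -- timestamps are microseconds since epoch
    SELECT status, writetime(status) AS status_ts,
           resp_json, writetime(resp_json) AS resp_ts
    FROM ks1.cf1 WHERE pid = 1 AND cid = 1;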


Help in c* Data modelling

2017-07-22 Thread techpyaasa .
Hi ,

We have a table like below:

CREATE TABLE ks.cf ( accountId bigint, pid bigint, dispName text, status
> int, PRIMARY KEY (accountId, pid) ) WITH CLUSTERING ORDER BY (pid ASC);



We would like to have the following queries possible on the above table:

select * from ks.cf where accountId=1 and pid=1;
select * from ks.cf where accountId=1 order by dispName asc;
select * from ks.cf where accountId=1 and status=0 order by dispName asc;

I know the first query is possible by default, but I want the last 2 queries
to work as well.

So can someone please let me know how I can achieve the same in
Cassandra (c* 2.1.17)? I'm OK with applying indexes etc.

Thanks
TechPyaasa


Re: Help in c* Data modelling

2017-07-23 Thread techpyaasa .
Hi Varun,

Thanks a lot for your reply.

In this case, if I want to update the status (status can be updated for a
given account_id, pid), I need to delete the existing row in the 2nd table &
add a new one... :( :(

It's like hitting Cassandra twice for 1 change... :(



On Sun, Jul 23, 2017 at 8:42 PM, Varun Barala <varunbaral...@gmail.com>
wrote:

> Hi,
>
> You can create a pseudo index table.
>
> IMO, the structure can be:
>
> CREATE TABLE IF NOT EXISTS test.user (
>     account_id bigint, pid bigint, disp_name text, status int,
>     PRIMARY KEY (account_id, pid)
> ) WITH CLUSTERING ORDER BY (pid ASC);
>
> CREATE TABLE IF NOT EXISTS test.user_index (
>     account_id bigint, pid bigint, disp_name text, status int,
>     PRIMARY KEY ((account_id, status), disp_name)
> ) WITH CLUSTERING ORDER BY (disp_name ASC);
>
> To support the query:  select * from test.user where account_id=1 order by
> disp_name asc;
> you can use an IN condition on the last partition key, status, in the table
> test.user_index.
>
> It depends on your use case and the amount of data as well. It can be
> optimized more...
>
> Thanks!!
>


Re: Help in c* Data modelling

2017-07-23 Thread techpyaasa .
Hi Vlad/Varun,

Instead of creating the second table as you said, can I just have the one
(first) table below and get all rows with status=0?

CREATE TABLE IF NOT EXISTS test.user ( account_id bigint, pid bigint,
disp_name text, status int, PRIMARY KEY (account_id, pid) ) WITH
CLUSTERING ORDER BY (pid ASC);

I mean, get all rows within the same partition (account_id) whose
status=0 (say, some value) using a UDF/UDA in c*?

select group_by_status from test.user;

where group_by_status is a UDA/UDF.


Thanks in advance
TechPyaasa
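
One caveat before any sketch: UDFs/UDAs only arrived in Cassandra 2.2, so on
2.1.17 this would require an upgrade first. With 2.2+, a sketch of such an
aggregate (hardcoding status=0, since aggregate arguments can only be
columns):

    CREATE OR REPLACE FUNCTION test.collect_status0 (
        state map<bigint, text>, pid bigint, disp_name text, status int)
    CALLED ON NULL INPUT RETURNS map<bigint, text>
    LANGUAGE java AS '
        if (status != null && status == 0)
            state.put(pid, disp_name);
        return state;';

    CREATE OR REPLACE AGGREGATE test.group_by_status0 (bigint, text, int)
    SFUNC collect_status0 STYPE map<bigint, text> INITCOND {};

    -- usage (restricted to a single partition):
    -- SELECT group_by_status0(pid, disp_name, status) FROM test.user WHERE account_id = 1;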


On Sun, Jul 23, 2017 at 10:42 PM, Vladimir Yudovin <vla...@winguzone.com>
wrote:

> Hi,
>
> unfortunately ORDER BY is supported for clustering columns only...
>
> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>


How to have nested collections in UDF/UDA

2017-07-27 Thread techpyaasa .
Hi ,

I have a table as below:

CREATE TABLE test.cs (
> pid bigint,
> cid bigint,
> stat_object text,
> status int,
> PRIMARY KEY (pid, cid)
> ) WITH CLUSTERING ORDER BY (cid ASC)


How can I have a function like below :

CREATE or REPLACE FUNCTION test.countstatusobjs (state map<int,
>  map<bigint,text>>, status int, cid bigint, stat_object text)
> RETURNS NULL ON NULL INPUT RETURNS map<int, map<bigint,text>> LANGUAGE java AS 'if
> (state.containsKey(status)) {
> Map<Long,String> mm = (Map) state.get(status);
> mm.put(cid, stat_object);
> state.put(status, mm);
> } else {
> Map<Long,String> mm = new HashMap<Long,String>();
> mm.put(cid, stat_object);
> state.put(status, mm);
> }
> return state;';


To have a map of cid to stat_object for a given status.


I succeeded in getting the count of each status using the functions below:

CREATE or REPLACE FUNCTION test.countstatus (state map<int,int> , status
> int)
> RETURNS NULL ON NULL INPUT RETURNS map<int,int> LANGUAGE java AS 'if
> (state.containsKey(status)) {
> state.put(status , (Integer) state.get(status) + 1);
> }else {
> state.put(status , 1);
> }
> return state;';


CREATE or REPLACE AGGREGATE groupbystatus (int)
> SFUNC countstatus STYPE map<int,int>
> INITCOND {};


select groupbystatus(status) from test.cs where pid=1;
>  test.groupbystatus(status)
> ----
>{0: 2, 1: 3, 2: 1, 5: 1}



In the same way, I want to achieve map<int, map<bigint,text>> (a map from
status to a map of cid to stat_object). How can I do the same?
Using c* 2.2.8.

Thanks in advance
TechPyaasa
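
A sketch of the nested version. Two assumptions worth flagging: nested
collections in CQL types must be declared frozen, and a frozen inner map
should not be mutated in place, so this copies it before updating:

    CREATE OR REPLACE FUNCTION test.countstatusobjs (
        state map<int, frozen<map<bigint, text>>>,
        status int, cid bigint, stat_object text)
    RETURNS NULL ON NULL INPUT
    RETURNS map<int, frozen<map<bigint, text>>>
    LANGUAGE java AS '
        java.util.Map<Long, String> mm = (java.util.Map<Long, String>) state.get(status);
        mm = (mm == null) ? new java.util.HashMap<Long, String>()
                          : new java.util.HashMap<Long, String>(mm);  // copy: inner map is frozen
        mm.put(cid, stat_object);
        state.put(status, mm);
        return state;';

    CREATE OR REPLACE AGGREGATE test.groupbystatusobjs (int, bigint, text)
    SFUNC countstatusobjs
    STYPE map<int, frozen<map<bigint, text>>>
    INITCOND {};

    -- usage: SELECT groupbystatusobjs(status, cid, stat_object) FROM test.cs WHERE pid = 1;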


Re: UDF for sorting

2017-07-04 Thread techpyaasa .
Hi Justin,

Thanks for the reply.
We are using c* 2.1.17; does the Lucene plugin work with this version?

On Tue, Jul 4, 2017 at 4:49 AM, Justin Cameron <jus...@instaclustr.com>
wrote:

> While you can't do this with Cassandra, you can get the functionality you
> want with the cassandra-lucene-plugin
> (https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.10/doc/documentation.rst#searching).
>
> Keep in mind that as with any secondary index there are
> performance-related limitations:
> https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.10/doc/documentation.rst#performance-tips
>
> On Tue, 4 Jul 2017 at 07:17 DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> Plain answer is no, you can't.
>>
>> The reason is that a UDF only transforms column values on each row; it does
>> not have the ability to modify row ordering.
>>
>> On Mon, Jul 3, 2017 at 10:14 PM, techpyaasa . <techpya...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I have a table like
>>>
>>> CREATE TABLE ks.cf ( pk1 bigint, cc1 bigint, disp_name text , stat_obj
>>> text, status int, PRIMARY KEY (pk1, cc1)) WITH CLUSTERING ORDER BY (cc1 ASC)
>>>
>>> CREATE INDEX idx1 on ks.cf(status);
>>>
>>> I want to have a queries like
>>> select * from ks.cf where pk1=123 and cc1=345;
>>>
>>> and
>>> select * from ks.cf where pk1=123 and status=1;
>>> In this case, I want rows to be sorted based on 'disp_name' (asc/desc).
>>>
>>> Can I achieve this using a UDF or anything else? (Sorry if my
>>> understanding of UDFs is wrong.)
>>>
>>> Thanks in advance
>>> TechPyaasa
>>>
>>
>> --
>
>
> *Justin Cameron*, Senior Software Engineer
>
>
> <https://www.instaclustr.com/>
>
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>


UDF for sorting

2017-07-03 Thread techpyaasa .
Hi all,

I have a table like

CREATE TABLE ks.cf ( pk1 bigint, cc1 bigint, disp_name text , stat_obj
text, status int, PRIMARY KEY (pk1, cc1)) WITH CLUSTERING ORDER BY (cc1 ASC)

CREATE INDEX idx1 on ks.cf(status);

I want to have a queries like
select * from ks.cf where pk1=123 and cc1=345;

and
select * from ks.cf where pk1=123 and status=1;
In this case, I want rows to be sorted based on 'disp_name' (asc/desc).

Can I achieve this using a UDF or anything else? (Sorry if my
understanding of UDFs is wrong.)

Thanks in advance
TechPyaasa


Re: Limit on having number of nodes in C* cluster

2017-08-22 Thread techpyaasa .
How can I decrease tokens for existing nodes?
Doesn't it create problems?


On Aug 22, 2017 7:22 PM, "Vladimir Yudovin" <vla...@winguzone.com> wrote:

Probably decreasing the tokens number
<https://issues.apache.org/jira/browse/CASSANDRA-7032> can help to manage a
big cluster?

Best regards, Vladimir Yudovin,
*Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*


 On Mon, 21 Aug 2017 19:38:37 -0400 *Eduard Tudenhoefner
<eduard.tudenhoef...@datastax.com <eduard.tudenhoef...@datastax.com>>*
wrote 

We've been doing successful testing with multi-DC setups and 500 nodes per
DC. However, I agree with Jon here. Certain things are easier/faster with
e.g. 5x100 node clusters than 1x500 node cluster.

Cheers

On Mon, Aug 21, 2017 at 10:16 AM, Jon Haddad <jonathan.had...@gmail.com>
wrote:

As far as I know, those 75K nodes are not in a single cluster.  If memory
serves correctly (and this article seems to indicate that it does:
http://www.techrepublic.com/article/apples-secret-nosql-sauce-includes-a-hefty-dose-of-cassandra/),
you’ll see clusters of 1,000 nodes.

Things start to get a little hairy once you go above a couple hundred
nodes.  I would rather run 5 100 node clusters than a single 500 node
cluster.  In theory, once you’ve built out the tooling to manage 2 clusters
you should be able to apply it to manage 20 (reality always gets in the way
though…)

Jon

On Aug 21, 2017, at 9:15 AM, techpyaasa . <techpya...@gmail.com> wrote:

Thanks a lot for the reply :)

On Aug 21, 2017 6:44 PM, "Vladimir Yudovin" <vla...@winguzone.com> wrote:


Actually there are clusters of thousands of nodes: some of the largest
production deployments include Apple's, with over 75,000 nodes storing over
10 PB of data <http://cassandra.apache.org/>

Best regards, Vladimir Yudovin,
*Winguzone <https://winguzone.com/?from=list> - Cloud Cassandra Hosting*


 On Mon, 21 Aug 2017 08:35:37 -0400 *techpyaasa . <techpya...@gmail.com
<techpya...@gmail.com>>* wrote 

Hi

Is there any limit on the number of nodes in a c* cluster?
Right now we have a c*-2.1.17 cluster with 3 DCs, each DC with 3 groups & each
group has 21 nodes.

We wanted to increase the cluster capacity by adding 6 nodes per group, as
many nodes' disk usage crossed 65%.

So I just wanted to clarify: is there any limit/drawback to having a huge
cluster/too many nodes in a c* cluster.

Thanks in advance
TechPyaasa


Huge Batches

2017-06-08 Thread techpyaasa .
Hi ,

Recently we are seeing huge batches and log prints as below in c* logs


Batch of prepared statements for [ks1.cf1] is of size 413350, exceeding
specified threshold of 5120 by 362150

Along with the column family name (as found in the above log print), we would
like to know the partition key and clustering column values (along with their
names) too, so that it would be easy to trace out the user who is
inserting such huge batches.

I tried looking at the C* code base as below, but could not figure out how to
get the values of partition keys and clustering columns. :(
Can someone please help me out...

public static void verifyBatchSize(Iterable<ColumnFamily> cfs)
{
    long size = 0;
    long warnThreshold = DatabaseDescriptor.getBatchSizeWarnThreshold();

    for (ColumnFamily cf : cfs)
        size += cf.dataSize();

    if (size > warnThreshold)
    {
        Set<String> ksCfPairs = new HashSet<>();
        for (ColumnFamily cf : cfs)
        {
            ksCfPairs.add(String.format("%s.%s size=%s",
                    cf.metadata().ksName, cf.metadata().cfName, cf.dataSize()));
            // attempt to reach the cell names; a CellName alone does not
            // expose the partition key
            Iterator<CellName> cns = cf.getColumnNames().iterator();
            CellName cn = cns.next();
            cn.dataSize();
        }

        String format = "Batch of prepared statements for {} is of size {}, exceeding specified threshold of {} by {}.";
        logger.warn(format, ksCfPairs, size, warnThreshold, size - warnThreshold);
    }
}
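
For reference, the ColumnFamily objects passed in above don't carry the partition key; the key lives on the enclosing mutation. A hedged sketch of a patched variant that takes the mutations instead, so the key can be printed. This is written against the 2.1 source, and names such as IMutation.getColumnFamilies(), IMutation.key() and CFMetaData.getKeyValidator() are assumptions that should be checked against the exact 2.1.x tree:

// Hypothetical replacement in org.apache.cassandra.cql3.statements.BatchStatement
public static void verifyBatchSize(Collection<? extends IMutation> mutations)
{
    long size = 0;
    long warnThreshold = DatabaseDescriptor.getBatchSizeWarnThreshold();

    for (IMutation mutation : mutations)
        for (ColumnFamily cf : mutation.getColumnFamilies())
            size += cf.dataSize();

    if (size > warnThreshold)
    {
        Set<String> keys = new HashSet<>();
        for (IMutation mutation : mutations)
        {
            for (ColumnFamily cf : mutation.getColumnFamilies())
            {
                // render the partition key bytes using the table's key validator
                String key = cf.metadata().getKeyValidator().getString(mutation.key());
                keys.add(String.format("%s.%s key=%s size=%s",
                        cf.metadata().ksName, cf.metadata().cfName, key, cf.dataSize()));
            }
        }
        logger.warn("Batch of prepared statements for {} is of size {}, exceeding specified threshold of {} by {}.",
                    keys, size, warnThreshold, size - warnThreshold);
    }
}

Clustering values could be rendered similarly with cf.getComparator().getString(cell.name()) for each cell, at the cost of much noisier log lines.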


Thanks
TechPyaasa


Re: Huge Batches

2017-06-09 Thread techpyaasa .
Hi Justin,

We have very few columns in the PK (max 2 partition columns, max 2 clustering
columns) and it won't have huge data / a huge number of primary keys.
I just wanted to print the names & values of these columns for huge batches.

PS: we are using c*-2.1

Thanks for reply @Justin and @Akhil

On Fri, Jun 9, 2017 at 5:31 AM, Justin Cameron <jus...@instaclustr.com>
wrote:

> I don't believe the keys within a large batch are logged by Cassandra. A
> large batch could potentially contain tens of thousands of primary keys, so
> this could quickly fill up the logs.
>
> Here are a couple of suggestions:
>
>- Large batches should also be slow, so you could try setting up slow
>query logging in the Java driver and see what gets caught:
>https://docs.datastax.com/en/developer/java-driver/3.2/manual/logging/
>- You could write your own custom QueryHandler to log those details on
>the server-side, as described here:
>https://www.slideshare.net/planetcassandra/cassandra-summit-2014-lesser-known-features-of-cassandra-21
>
>
> Cheers,
> Justin
>
> On Thu, 8 Jun 2017 at 18:49 techpyaasa . <techpya...@gmail.com> wrote:
>
>> Hi ,
>>
>> Recently we are seeing huge batches and log prints as below in c* logs
>>
>>
>> Batch of prepared statements for [ks1.cf1] is of size 413350, exceeding
>> specified threshold of 5120 by 362150
>>
>> Along with the column family name (as found in the above log print), we
>> would like to know the partition key and clustering column values (along
>> with their names) too, so that it would be easy to trace out the user who is
>> inserting such huge batches.
>>
>> I tried looking at the C* code base as below, but could not figure out how to
>> get the values of partition keys and clustering columns. :(
>> Can someone please help me out...
>>
>> public static void verifyBatchSize(Iterable<ColumnFamily> cfs)
>> {
>>     long size = 0;
>>     long warnThreshold = DatabaseDescriptor.getBatchSizeWarnThreshold();
>>
>>     for (ColumnFamily cf : cfs)
>>         size += cf.dataSize();
>>
>>     if (size > warnThreshold)
>>     {
>>         Set<String> ksCfPairs = new HashSet<>();
>>         for (ColumnFamily cf : cfs)
>>         {
>>             ksCfPairs.add(String.format("%s.%s size=%s",
>>                     cf.metadata().ksName, cf.metadata().cfName, cf.dataSize()));
>>             // attempt to reach the cell names; a CellName alone does not
>>             // expose the partition key
>>             Iterator<CellName> cns = cf.getColumnNames().iterator();
>>             CellName cn = cns.next();
>>             cn.dataSize();
>>         }
>>
>>         String format = "Batch of prepared statements for {} is of size {}, exceeding specified threshold of {} by {}.";
>>         logger.warn(format, ksCfPairs, size, warnThreshold, size - warnThreshold);
>>     }
>> }
>>
>>
>> Thanks
>>
>> TechPyaasa
>>
> --
>
>
> *Justin Cameron*, Senior Software Engineer
>
>
> <https://www.instaclustr.com/>
>
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>
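
On the first suggestion, a minimal client-side sketch with the DataStax Java driver's QueryLogger (assuming java-driver 3.x; the threshold is illustrative, and the com.datastax.driver.core.QueryLogger.SLOW logger must be enabled at DEBUG in the client's log configuration or nothing will be printed):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.QueryLogger;
import com.datastax.driver.core.Session;

public class SlowQueryLogging {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

        // Log any statement slower than 250 ms.
        QueryLogger queryLogger = QueryLogger.builder()
                .withConstantThreshold(250)
                .build();
        cluster.register(queryLogger);

        Session session = cluster.connect();
        session.execute("SELECT release_version FROM system.local");
        cluster.close();
    }
}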


Re: Secondary Index

2017-06-20 Thread techpyaasa .
Hi ZAIDI,

Thanks for reply.
Sorry I didn't get your line
"You can get away the potential situation by leveraging composite key, if
that is possible for you?"

How can I get through it??

Like I have a table as below
CREATE TABLE ks1.cf1 (id1 bigint, id2 bigint, resp text, status int,
PRIMARY KEY (id1, id2)
) WITH CLUSTERING ORDER BY (id2 ASC)

'status' will have values of 0/1/2/3/4 (5 possible values); insertions to the
table (partition) will happen based on id2, i.e. values (id1, id2, resp, status).

I want to have filtering/criteria applied on the 'status' column too, like
select * from ks1.cf1 where id1=123 and status=0;

How can I achieve this w/o a secondary index (on the 'status' column)?
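
If the status filter is always combined with id1, one reading of "leveraging a composite key" is a second, denormalized table whose partition key includes status. A sketch (untested; the application must delete+insert to move a row whenever its status changes):

CREATE TABLE ks1.cf1_by_status (
    id1 bigint,
    status int,
    id2 bigint,
    resp text,
    PRIMARY KEY ((id1, status), id2)
) WITH CLUSTERING ORDER BY (id2 ASC);

-- the desired filter becomes a single-partition read:
SELECT * FROM ks1.cf1_by_status WHERE id1 = 123 AND status = 0;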


On Tue, Jun 20, 2017 at 12:09 AM, ZAIDI, ASAD A <az1...@att.com> wrote:

> If you’re only creating an index so that your query works, think again!
> You’ll be storing the secondary index on each node; queries involving the
> index could create issues (slowness!!) down the road when indexes on multiple
> nodes are involved and not maintained!  Tables involving a lot of
> inserts/deletes could easily ruin index performance.
>
>
>
> You can get away from the potential situation by leveraging a composite key, if
> that is possible for you?
>
>
>
>
>
> *From:* techpyaasa . [mailto:techpya...@gmail.com]
> *Sent:* Monday, June 19, 2017 1:01 PM
> *To:* user@cassandra.apache.org
> *Subject:* Secondary Index
>
>
>
> Hi,
>
> I want to create an index on an already existing table which has more than 3
> GB/node.
> We are using c*-2.1.17 with 2 DCs, each DC with 3 groups, and each group
> has 7 nodes. (Total 42 nodes in the cluster.)
>
> So is it OK to create the index on this table now, or will it cause any problem?
> If it's OK, how much time would this process take?
>
>
> Thanks in advance,
> TechPyaasa
>


Re: Secondary Index

2017-06-25 Thread techpyaasa .
Thanks for the reply.

I just have one more doubt, please do clarify this.

Will there be any performance difference between these 2 queries on the
above table?

1. select * from ks1.cf1 where status=1;
2. select * from ks1.cf1 where id1=123456 and status=1;

where id1 is the partition key and status is the indexed column, as I said above.

Could you please tell me the performance difference between the above 2 queries.
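
The short version: query 1 has no partition restriction, so the coordinator fans the index lookup out across the whole cluster, while query 2 only touches the replicas for partition id1=123456. Query tracing in cqlsh makes the difference visible:

TRACING ON;

-- contacts many nodes: every token range is consulted via the index
select * from ks1.cf1 where status=1;

-- contacts only the replicas owning id1=123456
select * from ks1.cf1 where id1=123456 and status=1;

TRACING OFF;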

Thanks in advance,

Techpyaasaa

On Tue, Jun 20, 2017 at 9:03 PM, ZAIDI, ASAD A <az1...@att.com> wrote:

> Hey there –
>
>
>
> Like others suggested, before adding more indexes look for opportunities to
> de-normalize your data model OR create composite keys for your primary
> index – if that works for you.
>
> Secondary indexes are there so you can leverage them, but they come with a
> cost. They’re difficult to manage: as you repair data, your secondary index
> will NOT be automatically repaired, so you’ll need to maintain it on each
> cluster node. Depending on the size of your cluster that could be a
> significant effort. Be prepared to rebuild your new index (nodetool
> rebuild_index) as often as you change the data. Performance will eventually
> take a hit because index rebuilding is an expensive operation on CPU.
>
>
>
> See please
> http://docs.datastax.com/en/cql/3.1/cql/ddl/ddl_when_use_index_c.html
>
>
>
>
>
>
>
> *From:* techpyaasa . [mailto:techpya...@gmail.com]
> *Sent:* Tuesday, June 20, 2017 2:30 AM
> *To:* ZAIDI, ASAD A <az1...@att.com>
> *Cc:* user@cassandra.apache.org
> *Subject:* Re: Secondary Index
>
>
>
> Hi ZAIDI,
>
> Thanks for reply.
> Sorry I didn't get your line
> "You can get away the potential situation by leveraging composite key, if
> that is possible for you?"
>
> How can I get through it??
>
> Like I have a table as below
>
> CREATE TABLE ks1.cf1 (id1 bigint, id2 bigint, resp text, status int,
> PRIMARY KEY (id1, id2)
>
> ) WITH CLUSTERING ORDER BY (id2 ASC)
>
>
> 'status' will have values of 0/1/2/3/4 (5 possible values); insertions to the
> table (partition) will happen based on id2, i.e. values (id1, id2, resp, status).
>
> I want to have filtering/criteria applied on the 'status' column too, like
> select * from ks1.cf1 where id1=123 and status=0;
>
> How can I achieve this w/o a secondary index (on the 'status' column)?
>
>
>
> On Tue, Jun 20, 2017 at 12:09 AM, ZAIDI, ASAD A <az1...@att.com> wrote:
>
> If you’re only creating an index so that your query works, think again!
> You’ll be storing the secondary index on each node; queries involving the
> index could create issues (slowness!!) down the road when indexes on multiple
> nodes are involved and not maintained!  Tables involving a lot of
> inserts/deletes could easily ruin index performance.
>
>
>
> You can get away from the potential situation by leveraging a composite key, if
> that is possible for you?
>
>
>
>
>
> *From:* techpyaasa . [mailto:techpya...@gmail.com]
> *Sent:* Monday, June 19, 2017 1:01 PM
> *To:* user@cassandra.apache.org
> *Subject:* Secondary Index
>
>
>
> Hi,
>
> I want to create an index on an already existing table which has more than 3
> GB/node.
> We are using c*-2.1.17 with 2 DCs, each DC with 3 groups, and each group
> has 7 nodes. (Total 42 nodes in the cluster.)
>
> So is it OK to create the index on this table now, or will it cause any problem?
> If it's OK, how much time would this process take?
>
>
> Thanks in advance,
> TechPyaasa
>
>
>


Best practice to add(bootstrap) multiple nodes to cluster at once

2017-06-20 Thread techpyaasa .
Hi,

What is the best practice to add (bootstrap) multiple nodes at once to a c*
cluster?

Using c*-2.1.17 , 2 DCs , 3 groups in each DC

Thanks
TechPyaasa


Secondary Index

2017-06-19 Thread techpyaasa .
Hi,

I want to create an index on an already existing table which has more than 3
GB/node.
We are using c*-2.1.17 with 2 DCs, each DC with 3 groups, and each group
has 7 nodes. (Total 42 nodes in the cluster.)

So is it OK to create the index on this table now, or will it cause any problem?
If it's OK, how much time would this process take?


Thanks in advance,
TechPyaasa


How to find dataSize at client side?

2017-05-23 Thread techpyaasa .
WARN [SharedPool-Worker-1] 2017-05-22 20:28:46,204 BatchStatement.java
(line 253) Batch of prepared statements for [site24x7.wm_rawstats_tb,
site24x7.wm_rawstats] is of size 6122, exceeding specified threshold of
5120 by 1002

We are frequently getting this message in the logs, so I want to restrict
inserts at the client side by calculating the dataSize of insert/batch
statements before sending them to the c* servers.

We are using the DataStax Java driver; how can I get the dataSize here?


Any ideas??
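
There is no public driver call guaranteed to reproduce the server-side mutation size exactly, but the serialized size of the bound values gives a usable approximation. A sketch with the java-driver 3.x TypeCodec API (treat the comparison against the 5 KB threshold as approximate, since the server measures mutation data size, not wire size):

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.CodecRegistry;
import com.datastax.driver.core.ColumnDefinitions;
import com.datastax.driver.core.ProtocolVersion;
import com.datastax.driver.core.TypeCodec;
import java.nio.ByteBuffer;

public final class StatementSizeEstimate {
    // Rough payload size of a bound statement: the sum of its serialized values.
    public static int approximateSize(BoundStatement bs,
                                      ProtocolVersion pv,
                                      CodecRegistry registry) {
        ColumnDefinitions vars = bs.preparedStatement().getVariables();
        int size = 0;
        for (int i = 0; i < vars.size(); i++) {
            if (!bs.isSet(i) || bs.isNull(i))
                continue;
            TypeCodec<Object> codec = registry.codecFor(vars.getType(i));
            ByteBuffer serialized = codec.serialize(bs.getObject(i), pv);
            if (serialized != null)
                size += serialized.remaining();
        }
        return size;
    }
}

Summing this over the statements of a batch before executing lets the client split the batch when the total approaches batch_size_warn_threshold_in_kb. Newer 3.x drivers also expose Statement.requestSizeInBytes(ProtocolVersion, CodecRegistry), which measures the whole request frame; if your driver version has it, that may be simpler, though it still won't match the server's accounting exactly.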

Thanks in advance
TechPyaasa


Re: How to find dataSize at client side?

2017-05-24 Thread techpyaasa .
Hi Nicolas

I think only DataStax Enterprise (paid) users can ask questions/get
support from DataStax :(

On Tue, May 23, 2017 at 9:44 PM, techpyaasa . <techpya...@gmail.com> wrote:

> Thanks for your reply..
>
> On Tue, May 23, 2017 at 7:40 PM, Nicolas Guyomar <
> nicolas.guyo...@gmail.com> wrote:
>
>> Hi,
>>
>> If you were to know the batch size on the client side to make sure it does
>> not get above the 5kb limit, so that you can "limit the number of
>> statements in a batch", I would suspect you do not need a batch at all, right?
>> See https://inoio.de/blog/2016/01/13/cassandra-to-batch-or-not-to-batch/
>>
>> As for your question, you might get an answer on the java driver ML :
>> java-driver-u...@lists.datastax.com
>>
>>
>> On 23 May 2017 at 15:25, techpyaasa . <techpya...@gmail.com> wrote:
>>
>>>
>>> WARN [SharedPool-Worker-1] 2017-05-22 20:28:46,204 BatchStatement.java
>>> (line 253) Batch of prepared statements for [site24x7.wm_rawstats_tb,
>>> site24x7.wm_rawstats] is of size 6122, exceeding specified threshold of
>>> 5120 by 1002
>>>
>>> We are frequently getting this message in the logs, so I want to restrict
>>> inserts at the client side by calculating the dataSize of insert/batch
>>> statements before sending them to the c* servers.
>>>
>>> We are using the DataStax Java driver; how can I get the dataSize here?
>>>
>>>
>>> Any ideas??
>>>
>>> Thanks in advance
>>> TechPyaasa
>>>
>>
>>
>


Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2017-05-04 Thread techpyaasa .
Hi guys,

Has anybody got fix for this issue?
Recently we upgraded our c* cluster from 2.0.17 to 2.1.17, and we saw an
increase in read latency on a few tables; read latency increased to almost
2 to 3 times what it was in 2.0.17.

Kindly let me know the fix for it if anybody knows it.

Thanks
TechPyaasa

On Wed, Nov 9, 2016 at 1:28 AM, Dikang Gu <dikan...@gmail.com> wrote:

> Michael, thanks for the info. It sounds to me a very serious performance
> regression. :(
>
> On Tue, Nov 8, 2016 at 11:39 AM, Michael Kjellman <
> mkjell...@internalcircle.com> wrote:
>
>> Yes, We hit this as well. We have a internal patch that I wrote to mostly
>> revert the behavior back to ByteBuffers with as small amount of code change
>> as possible. Performance of our build is now even with 2.0.x and we've also
>> forward ported it to 3.x (although the 3.x patch was even more complicated
>> due to Bounds, RangeTombstoneBound, ClusteringPrefix which actually
>> increases the number of allocations to somewhere between 11 and 13
>> depending on how I count it per indexed block -- making it even worse than
>> what you're observing in 2.1.
>>
>> We haven't upstreamed it as 2.1 is obviously not taking any changes at
>> this point and the longer term solution is https://issues.apache.org/jira
>> /browse/CASSANDRA-9754 (which also includes the changes to go back to
>> ByteBuffers and remove as much of the Composites from the storage engine as
>> possible.) Also, the solution is a bit of a hack -- although it was a
>> blocker from us deploying 2.1 -- so i'm not sure how "hacky" it is if it
>> works..
>>
>> best,
>> kjellman
>>
>>
>> On Nov 8, 2016, at 11:31 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>
>> This is very expensive:
>>
>> "MessagingService-Incoming-/2401:db00:21:1029:face:0:9:0" prio=10
>> tid=0x7f2fd57e1800 nid=0x1cc510 runnable [0x7f2b971b]
>>java.lang.Thread.State: RUNNABLE
>> at org.apache.cassandra.db.marshal.IntegerType.compare(IntegerType.java:29)
>> at org.apache.cassandra.db.composites.AbstractSimpleCellNameType.compare(AbstractSimpleCellNameType.java:98)
>> at org.apache.cassandra.db.composites.AbstractSimpleCellNameType.compare(AbstractSimpleCellNameType.java:31)
>> at java.util.TreeMap.put(TreeMap.java:545)
>> at java.util.TreeSet.add(TreeSet.java:255)
>> at org.apache.cassandra.db.filter.NamesQueryFilter$Serializer.deserialize(NamesQueryFilter.java:254)
>> at org.apache.cassandra.db.filter.NamesQueryFilter$Serializer.deserialize(NamesQueryFilter.java:228)
>> at org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:104)
>> at org.apache.cassandra.db.ReadCommandSerializer.deserialize(ReadCommand.java:156)
>> at org.apache.cassandra.db.ReadCommandSerializer.deserialize(ReadCommand.java:132)
>> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99)
>> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
>> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:172)
>> at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:88)
>>
>>
>> Checked the git history, it comes from this jira:
>> https://issues.apache.org/jira/browse/CASSANDRA-5417
>>
>> Any thoughts?
>> ​
>>
>> On Fri, Oct 28, 2016 at 10:32 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
>> Haven't seen this before, but perhaps it's related to CASSANDRA-10433?
>> This is just a wild guess as it's in a related codepath, but maybe worth
>> trying out the patch available to see if it helps anything...
>>
>> 2016-10-28 15:03 GMT-02:00 Dikang Gu <dikan...@gmail.com>:
>> We are seeing a huge cpu regression when upgrading one of our 2.0.16
>> clusters to 2.1.14 as well. The 2.1.14 node is not able to handle the same
>> amount of read traffic as the 2.0.16 node; actually, it's less than 50%.
>>
>> And in the perf results, the first line could go as high as 50%, as we
>> turn up the read traffic, which never appeared in 2.0.16.
>>
>> Any thoughts?
>> Thanks
>>
>>
>> Samples: 952K of event 'cycles', Event count (approx.): 229681774560

Limit on number of connections to Cassandra

2017-09-08 Thread techpyaasa .
Hi

Is there any limit on the number of client connections to Cassandra, just like
MySQL etc.?

If YES, what is it & how can we set it?

If NO, how will we get to know that a node has reached its capacity serving
client requests / is overloaded?

Using C*-2.1.17, DataStax Java driver
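
There is a cap for native-protocol connections in cassandra.yaml (added by CASSANDRA-8086; worth confirming the options exist in your exact 2.1.x build). Both default to -1, i.e. unlimited:

native_transport_max_concurrent_connections: -1
native_transport_max_concurrent_connections_per_ip: -1

Beyond that, overload usually shows up as growing Native-Transport-Requests pending/blocked counts in nodetool tpstats and rising client latencies.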


Thanks
Techpyaasa


Limit on having number of nodes in C* cluster

2017-08-21 Thread techpyaasa .
Hi

Is there any limit on the number of nodes in a c* cluster?
Right now we have a c*-2.1.17 cluster with 3 DCs, each DC with 3 groups & each
group has 21 nodes.

We wanted to increase the cluster capacity by adding 6 nodes per group, as
many nodes' disk usage crossed 65%.

So I just wanted to clarify: is there any limit/drawback to having a huge
cluster/too many nodes in a c* cluster.

Thanks in advance
TechPyaasa


Re: Limit on having number of nodes in C* cluster

2017-08-21 Thread techpyaasa .
Thanks a lot for the reply :)

On Aug 21, 2017 6:44 PM, "Vladimir Yudovin" <vla...@winguzone.com> wrote:

> Actually there are clusters of thousands of nodes: some of the largest
> production deployments include Apple's, with over 75,000 nodes storing over
> 10 PB of data <http://cassandra.apache.org/>
>
> Best regards, Vladimir Yudovin,
> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>
>
>  On Mon, 21 Aug 2017 08:35:37 -0400 *techpyaasa .
> <techpya...@gmail.com <techpya...@gmail.com>>* wrote 
>
> Hi
>
> Is there any limit on the number of nodes in a c* cluster?
> Right now we have a c*-2.1.17 cluster with 3 DCs, each DC with 3 groups &
> each group has 21 nodes.
>
> We wanted to increase the cluster capacity by adding 6 nodes per group, as
> many nodes' disk usage crossed 65%.
>
> So I just wanted to clarify: is there any limit/drawback to having a huge
> cluster/too many nodes in a c* cluster.
>
> Thanks in advance
> TechPyaasa
>
>
>


DigestMismatchException after upgrade from c*-2.1.17 to c*-3.0.15

2018-04-17 Thread techpyaasa
Hi,

We have recently upgraded our Cassandra production cluster (2 datacenters,
each with 6 nodes, 3 groups) from c*-2.1.17 to c*-3.0.15.

After the upgrade we are getting many exceptions like the one below.

org.apache.cassandra.service.DigestMismatchException: Mismatch for key
> DecoratedKey(-1032881015386111041, 03c099b9959871a9)
> (a613f5fd9fc797b252e26fe9b9b1ed4e vs 15b7d82a9b454f5fd433317f68de435f) at
> org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:92)
> at
> org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> at java.lang.Thread.run(Thread.java:745)
>

No hints are present / no mutations dropped, but the above exception is still
thrown quite frequently.

Could someone help us find the root cause?

Thanks in advance
TechPyaasa