Re: EXT: Cassandra Monitoring tool

2018-05-25 Thread Harikrishnan Pillai
I assume you are using open source Cassandra; you can look at Prometheus and 
Grafana for Cassandra monitoring. There is plenty of information available on 
the internet about how to set up Prometheus monitoring for Cassandra.


Sent from my iPhone

> On May 25, 2018, at 9:23 AM, ANEESH KUMAR K.M  wrote:
> 
> Please suggest a good cluster monitoring tool for a Cassandra multi-region 
> cluster.
> 

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Re: Tuning bootstrap new node

2017-10-31 Thread Harikrishnan Pillai
There is no magic to speeding up node addition other than increasing stream 
throughput and compaction throughput.

It has been noticed that with heavy compactions, latency may go up if the node 
also starts serving data.

If you really don't want this node to serve traffic until all compactions 
settle down, you can disable gossip and the binary protocol using nodetool. 
This will allow compactions to continue but requires a repair later to fix the 
stale data.
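
A minimal sketch of that sequence (standard nodetool commands, run against the 
joining node):

nodetool disablegossip    # other nodes mark this node down; no new traffic
nodetool disablebinary    # stop serving CQL clients on the native protocol
# ...wait for compactions to settle (watch nodetool compactionstats), then:
nodetool enablegossip
nodetool enablebinary
nodetool repair           # bring back any data missed while cut off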

Regards

Hari



From: Nitan Kainth 
Sent: Tuesday, October 31, 2017 5:47 AM
To: user@cassandra.apache.org
Subject: EXT: Re: Tuning bootstrap new node

Do not stop compaction; you will end up with thousands of SSTables.

You can increase stream throughput from the default 200 to a higher value if 
your network can handle it.
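
For example (values are illustrative; units follow cassandra.yaml):

nodetool setstreamthroughput 400       # default is 200 Mb/s
nodetool setcompactionthroughput 64    # default is 16 MB/s; 0 removes the throttle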

Sent from my iPhone

On Oct 31, 2017, at 6:35 AM, Peng Xiao <2535...@qq.com> 
wrote:

Can we stop compaction while the new node is bootstrapping and enable it after 
the new node has joined?

Thanks
-- Original --
From:  "我自己的邮箱";<2535...@qq.com>;
Date:  Tue, Oct 31, 2017 07:18 PM
To:  "user">;
Subject:  Tuning bootstrap new node

Dear All,

Can we do some tuning to make bootstrapping a new node quicker? We have a 
three-DC cluster (RF=3 in two DCs, RF=1 in another; 48 nodes in the DC with 
RF=3). As the cluster becomes larger and larger, we need more than 24 hours to 
bootstrap a new node.
Could you please advise how to tune this?

Many Thanks,
Peng Xiao


Re: Moving from DSE to Cassandra.

2017-07-18 Thread Harikrishnan Pillai
yes.



From: Harikrishnan Pillai <hpil...@walmartlabs.com>
Sent: Tuesday, July 18, 2017 12:30 PM
To: user@cassandra.apache.org
Subject: EXT: Re: Moving from DSE to Cassandra.


Moving away from DSE to Apache Cassandra is easy.

If there are any keyspaces with the "EverywhereStrategy" replication strategy, 
change them to "NetworkTopologyStrategy" first, particularly the DSE system 
keyspaces. Start in open source mode and run an sstable upgrade if needed.
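
A sketch of the strategy change (the DC name and replication factor here are 
assumptions; match them to your topology):

ALTER KEYSPACE dse_system
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};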

Regards

Hari



From: Pranay akula <pranay.akula2...@gmail.com>
Sent: Tuesday, July 18, 2017 12:24 PM
To: user@cassandra.apache.org
Subject: EXT: Moving from DSE to Cassandra.

Planning to move from DSE to open source Cassandra.

What would be the better way to do it?

1) I am thinking of introducing a new datacenter with Apache Cassandra, 
replicating to that DC, and after data replication removing the DSE Cassandra 
DC.
2) Upgrading from the current DSE version to a later version of Cassandra, 
possibly 2.2 or 3.x.

Has anyone done this before? Were any issues faced?

Thanks
Pranay.


Re: Moving from DSE to Cassandra.

2017-07-18 Thread Harikrishnan Pillai
Moving away from DSE to Apache Cassandra is easy.

If there are any keyspaces with the "EverywhereStrategy" replication strategy, 
change them to "NetworkTopologyStrategy" first, particularly the DSE system 
keyspaces. Start in open source mode and run an sstable upgrade if needed.

Regards

Hari



From: Pranay akula 
Sent: Tuesday, July 18, 2017 12:24 PM
To: user@cassandra.apache.org
Subject: EXT: Moving from DSE to Cassandra.

Planning to move from DSE to open source Cassandra.

What would be the better way to do it?

1) I am thinking of introducing a new datacenter with Apache Cassandra, 
replicating to that DC, and after data replication removing the DSE Cassandra 
DC.
2) Upgrading from the current DSE version to a later version of Cassandra, 
possibly 2.2 or 3.x.

Has anyone done this before? Were any issues faced?

Thanks
Pranay.


Re: EXT: Start Cassandra with Gossip disabled ?

2017-04-14 Thread Harikrishnan Pillai
Yes, you can disable gossip and disable binary to cut off all traffic, but 
processes like compactions can still continue.

Sent from my iPhone

> On Apr 11, 2017, at 9:21 AM, Biscuit Ninja  wrote:
> 
> We run an 8 node Cassandra v2.1.16 cluster (4 nodes in two discrete 
> datacentres) and we're currently investigating a problem where restarting 
> Cassandra on a node resulted in the filling of Eden/Survivor/Old and frequent 
> GCs.
> 
> http://imgur.com/a/OR1dk
> 
> This hammered reads from our application tier (writes seemed okay) and until 
> we determine what the root cause is, we'd like to be able to start Cassandra 
> with gossip disabled.
> 
> Is this possible?
> Thanks
> .bN


Re: compaction falling behind

2017-02-13 Thread Harikrishnan Pillai
If your compaction strategy is Leveled, the number of SSTables in each level 
is a good indication of whether compactions are keeping up.
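
For example, nodetool cfstats (tablestats on newer versions) prints the 
per-level counts for an LCS table; keyspace and table names here are 
placeholders:

nodetool cfstats myks.mytable | grep "SSTables in each level"
# SSTables in each level: [2, 10, 104/100, 0, 0, 0, 0, 0, 0]   (illustrative)
# a level shown as 104/100 holds more sstables than its target, i.e. backlog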


From: Ben Bromhead 
Sent: Monday, February 13, 2017 1:49:05 PM
To: user
Subject: Re: compaction falling behind


You can do so in two ways:

1) direct observation:
You can keep an eye on the number of pending compactions. This will fluctuate 
with load, compaction strategy, ongoing repairs and nodes bootstrapping but 
generally the pattern is it should trend towards 0.

There have been a number of bugs in past versions of Cassandra whereby the 
number of pending compactions is not reported correctly, so depending on what 
version of Cassandra you run this could impact you.

2) Indirect observation
You can keep an eye on metrics that healthy compaction will directly contribute 
to. These include the number of sstables per read histogram, estimated 
droppable tombstones, tombstones per read etc. You should keep an eye on these 
things anyway as they can often show you areas where you can fine tune 
compaction or your data model.

Everything exposed by nodetool is consumable via JMX which is great to plug 
into your metrics/monitoring/observability system :)
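
For example, a quick check of both from the shell (keyspace/table names are 
placeholders; cfhistograms is spelled tablehistograms on 3.x):

nodetool compactionstats             # "pending tasks" should trend towards 0
nodetool cfhistograms myks mytable   # watch the SSTables-per-read column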

On Mon, 13 Feb 2017 at 13:23 John Sanda 
> wrote:
What is a good way to determine whether or not compaction is falling behind? I 
read a couple things earlier that suggest nodetool compactionstats might not be 
the most reliable thing to use.



- John
--
Ben Bromhead
CTO | Instaclustr
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Error when running nodetool cleanup after adding a new node to a cluster

2017-02-08 Thread Harikrishnan Pillai
The cleanup has to run on the other nodes.

Sent from my iPhone

On Feb 8, 2017, at 9:14 PM, Srinath Reddy 
> wrote:

Hi,

Trying to re-balance a Cassandra cluster after adding a new node, and I'm 
getting this error when running nodetool cleanup. The Cassandra cluster is 
running in a Kubernetes cluster.

Cassandra version is 2.2.8

nodetool cleanup
error: io.k8s.cassandra.KubernetesSeedProvider
Fatal configuration error; unable to start server.  See log for stacktrace.
-- StackTrace --
org.apache.cassandra.exceptions.ConfigurationException: 
io.k8s.cassandra.KubernetesSeedProvider
Fatal configuration error; unable to start server.  See log for stacktrace.
at 
org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:676)
at 
org.apache.cassandra.config.DatabaseDescriptor.<init>(DatabaseDescriptor.java:119)
at org.apache.cassandra.tools.NodeProbe.checkJobs(NodeProbe.java:256)
at org.apache.cassandra.tools.NodeProbe.forceKeyspaceCleanup(NodeProbe.java:262)
at org.apache.cassandra.tools.nodetool.Cleanup.execute(Cleanup.java:55)
at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:244)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:158)

nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load   Tokens   Owns (effective)  Host ID  
 Rack
UN  10.244.3.4   6.91 GB256  60.8% 
bad1c6c6-8c2e-4f0c-9aea-0d63b451e7a1  rack1
UN  10.244.0.3   6.22 GB256  60.2% 
936cb0c0-d14f-4ddd-bfde-3865b922e267  rack1
UN  10.244.1.3   6.12 GB256  59.4% 
0cb43711-b155-449c-83ba-00ed2a97affe  rack1
UN  10.244.4.3   632.43 MB  256  57.8% 
55095c75-26df-4180-9004-9fabf88faacc  rack1
UN  10.244.2.10  6.08 GB256  61.8% 
32e32bd2-364f-4b6f-b13a-8814164ed160  rack1


Any suggestions on what is needed to re-balance the cluster after adding the 
new node? I have run nodetool repair but not able to run nodetool cleanup.

Thanks.




Re: Is it possible to have a column which can hold any data type (for inserting as json)

2017-02-01 Thread Harikrishnan Pillai
When you run a CQL query like select json from table where pk = ?, you will 
get the value, which is the full JSON. But if you have a requirement to query 
the JSON by some fields inside it, you have to create additional columns for 
those fields and create a secondary index on them.
Then you can query
select json from table where address = ?, assuming you created a secondary 
index on the address column.
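
A sketch of that approach against the data(id, json) table defined below 
(column and index names are illustrative; the application must also populate 
address alongside json):

ALTER TABLE data ADD address text;
CREATE INDEX data_address_idx ON data (address);
SELECT json FROM data WHERE address = '127.0.0.1';
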
Regards
Hari

Sent from my iPhone

On Feb 1, 2017, at 9:18 PM, Rajeswari Menon 
<rajeswar...@thinkpalm.com> wrote:

Could you please help me on this? I am a newbie in Cassandra. So if I need to 
add the JSON as a String, I can define the table as below.

create table data
(
  id int primary key,
  json text
);

The insert query will be as follows:

insert into data (id, json) values (1, '{
   "address":"127.0.0.1",
   "datatype":"DOUBLE",
   "name":"Longitude",
   "attributes":{
  "Id":"1"
   },
   "category":"REAL",
   "value":1.390692,
   "timestamp":1485923271718,
   "quality":"GOOD"
}');

Now how can I query the value field?


From: Harikrishnan Pillai [mailto:hpil...@walmartlabs.com]
Sent: 02 February 2017 10:18
To: user@cassandra.apache.org
Subject: Re: Is it possible to have a column which can hold any data type (for 
inserting as json)

You can create additional columns and create a secondary index based on the 
fields you want to query.
The best option is to store the full JSON in Cassandra and index the fields 
you want to query on in Solr.

Sent from my iPhone

On Feb 1, 2017, at 8:41 PM, Rajeswari Menon 
<rajeswar...@thinkpalm.com> wrote:
Yes, I know that. My intention is to do an aggregate query on the value field 
(in the JSON). Will that be possible if I store the entire JSON as a String? I 
will have to parse it according to my need, right?

Regards,
Rajeswari

From: Harikrishnan Pillai [mailto:hpil...@walmartlabs.com]
Sent: 02 February 2017 10:08
To: user@cassandra.apache.org
Subject: Re: Is it possible to have a column which can hold any data type (for 
inserting as json)

You can use the text type in Cassandra and store the full JSON string.

Sent from my iPhone

On Feb 1, 2017, at 8:30 PM, Rajeswari Menon 
<rajeswar...@thinkpalm.com> wrote:
Yes. Is there any way to define value to accept any data type, as the JSON 
value data may vary? Or is there any way to do the same without defining a 
schema?

Regards,
Rajeswari

From: Benjamin Roth [mailto:benjamin.r...@jaumo.com]
Sent: 01 February 2017 15:36
To: user@cassandra.apache.org
Subject: RE: Is it possible to have a column which can hold any data type (for 
inserting as json)

Value is defined as a text column and you try to insert a double. That's 
simply not allowed.

On 01.02.2017 09:02, "Rajeswari Menon" 
<rajeswar...@thinkpalm.com> wrote:
Given below is the CQL query I executed.

insert into data JSON'{
  "id": 1,
   "address":"",
   "datatype":"DOUBLE",
   "name":"Longitude",
   "attributes":{
  "ID":"1"
   },
   "category":"REAL",
   "value":1.390692,
   "timestamp":1485923271718,
   "quality":"GOOD"
}';

Regards,
Rajeswari

From: Benjamin Roth 
[mailto:benjamin.r...@jaumo.com]
Sent: 01 February 2017 12:35
To: user@cassandra.apache.org
Subject: Re: Is it possible to have a column which can hold any data type (for 
inserting as json)

You should post the whole CQL query you try to execute! Why don't you use a 
native JSON type for your JSON data?

2017-02-01 7:51 GMT+01:00 Rajeswari Menon 
<rajeswar...@thinkpalm.com>:
Hi,

I have a json data as shown below.

{
"address":"127.0.0.1",
"datatype":"DOUBLE",
"name":"Longitude",
 "attributes":{
"Id":"1"
},
"category":"REAL",
"value":1.390692,
"timestamp":1485923271718,
"quality":"GOOD"
}

To store the above json to Cassandra, I defined a table as shown below

create table data
(
  id int primary key,
  address text,
  

Re: Is it possible to have a column which can hold any data type (for inserting as json)

2017-02-01 Thread Harikrishnan Pillai
You can create additional columns and create a secondary index based on the 
fields you want to query.
The best option is to store the full JSON in Cassandra and index the fields 
you want to query on in Solr.

Sent from my iPhone

On Feb 1, 2017, at 8:41 PM, Rajeswari Menon 
<rajeswar...@thinkpalm.com> wrote:

Yes, I know that. My intention is to do an aggregate query on the value field 
(in the JSON). Will that be possible if I store the entire JSON as a String? I 
will have to parse it according to my need, right?

Regards,
Rajeswari

From: Harikrishnan Pillai [mailto:hpil...@walmartlabs.com]
Sent: 02 February 2017 10:08
To: user@cassandra.apache.org
Subject: Re: Is it possible to have a column which can hold any data type (for 
inserting as json)

You can use the text type in Cassandra and store the full JSON string.

Sent from my iPhone

On Feb 1, 2017, at 8:30 PM, Rajeswari Menon 
<rajeswar...@thinkpalm.com> wrote:
Yes. Is there any way to define value to accept any data type, as the JSON 
value data may vary? Or is there any way to do the same without defining a 
schema?

Regards,
Rajeswari

From: Benjamin Roth [mailto:benjamin.r...@jaumo.com]
Sent: 01 February 2017 15:36
To: user@cassandra.apache.org
Subject: RE: Is it possible to have a column which can hold any data type (for 
inserting as json)

Value is defined as a text column and you try to insert a double. That's 
simply not allowed.

On 01.02.2017 09:02, "Rajeswari Menon" 
<rajeswar...@thinkpalm.com> wrote:
Given below is the CQL query I executed.

insert into data JSON'{
  "id": 1,
   "address":"",
   "datatype":"DOUBLE",
   "name":"Longitude",
   "attributes":{
  "ID":"1"
   },
   "category":"REAL",
   "value":1.390692,
   "timestamp":1485923271718,
   "quality":"GOOD"
}';

Regards,
Rajeswari

From: Benjamin Roth 
[mailto:benjamin.r...@jaumo.com]
Sent: 01 February 2017 12:35
To: user@cassandra.apache.org
Subject: Re: Is it possible to have a column which can hold any data type (for 
inserting as json)

You should post the whole CQL query you try to execute! Why don't you use a 
native JSON type for your JSON data?

2017-02-01 7:51 GMT+01:00 Rajeswari Menon 
<rajeswar...@thinkpalm.com>:
Hi,

I have a json data as shown below.

{
"address":"127.0.0.1",
"datatype":"DOUBLE",
"name":"Longitude",
 "attributes":{
"Id":"1"
},
"category":"REAL",
"value":1.390692,
"timestamp":1485923271718,
"quality":"GOOD"
}

To store the above json to Cassandra, I defined a table as shown below

create table data
(
  id int primary key,
  address text,
  datatype text,
  name text,
  attributes map < text, text >,
  category text,
  value text,
  "timestamp" timestamp,
  quality text
);

When I try to insert the data as JSON, I get the error: Error decoding JSON 
value for value: Expected a UTF-8 string, but got a Double: 1.390692. The 
message is clear that a double value cannot be inserted into a text column. 
The real issue is that the value can be of any data type, so the schema cannot 
be predefined. Is there a way to create a column which can hold a value of any 
data type? (I don't want to hold the entire JSON as a string. My preferred way 
is to define a schema.)

Regards,
Rajeswari



--
Benjamin Roth
Prokurist

Jaumo GmbH * www.jaumo.com
Wehrstraße 46 * 73035 Göppingen * Germany
Phone +49 7161 304880-6 * Fax +49 7161 304880-1
AG Ulm * HRB 731058 * Managing Director: Jens Kammerer


Re: Is it possible to have a column which can hold any data type (for inserting as json)

2017-02-01 Thread Harikrishnan Pillai
You can use the text type in Cassandra and store the full JSON string.

Sent from my iPhone

On Feb 1, 2017, at 8:30 PM, Rajeswari Menon 
> wrote:

Yes. Is there any way to define value to accept any data type, as the JSON 
value data may vary? Or is there any way to do the same without defining a 
schema?

Regards,
Rajeswari

From: Benjamin Roth [mailto:benjamin.r...@jaumo.com]
Sent: 01 February 2017 15:36
To: user@cassandra.apache.org
Subject: RE: Is it possible to have a column which can hold any data type (for 
inserting as json)

Value is defined as a text column and you try to insert a double. That's 
simply not allowed.

On 01.02.2017 09:02, "Rajeswari Menon" 
> wrote:
Given below is the CQL query I executed.

insert into data JSON'{
  "id": 1,
   "address":"",
   "datatype":"DOUBLE",
   "name":"Longitude",
   "attributes":{
  "ID":"1"
   },
   "category":"REAL",
   "value":1.390692,
   "timestamp":1485923271718,
   "quality":"GOOD"
}';

Regards,
Rajeswari

From: Benjamin Roth 
[mailto:benjamin.r...@jaumo.com]
Sent: 01 February 2017 12:35
To: user@cassandra.apache.org
Subject: Re: Is it possible to have a column which can hold any data type (for 
inserting as json)

You should post the whole CQL query you try to execute! Why don't you use a 
native JSON type for your JSON data?

2017-02-01 7:51 GMT+01:00 Rajeswari Menon 
>:
Hi,

I have a json data as shown below.

{
"address":"127.0.0.1",
"datatype":"DOUBLE",
"name":"Longitude",
 "attributes":{
"Id":"1"
},
"category":"REAL",
"value":1.390692,
"timestamp":1485923271718,
"quality":"GOOD"
}

To store the above json to Cassandra, I defined a table as shown below

create table data
(
  id int primary key,
  address text,
  datatype text,
  name text,
  attributes map < text, text >,
  category text,
  value text,
  "timestamp" timestamp,
  quality text
);

When I try to insert the data as JSON, I get the error: Error decoding JSON 
value for value: Expected a UTF-8 string, but got a Double: 1.390692. The 
message is clear that a double value cannot be inserted into a text column. 
The real issue is that the value can be of any data type, so the schema cannot 
be predefined. Is there a way to create a column which can hold a value of any 
data type? (I don't want to hold the entire JSON as a string. My preferred way 
is to define a schema.)

Regards,
Rajeswari



--
Benjamin Roth
Prokurist

Jaumo GmbH * www.jaumo.com
Wehrstraße 46 * 73035 Göppingen * Germany
Phone +49 7161 304880-6 * Fax +49 7161 
304880-1
AG Ulm * HRB 731058 * Managing Director: Jens Kammerer


Re: Re : Decommissioned nodes show as DOWN in Cassandra versions 2.1.12 - 2.1.16

2017-01-27 Thread Harikrishnan Pillai
Please remove the IPs from the system.peers table on all nodes, or you can 
use unsafeAssassinate via JMX.
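
A sketch of both options (10.0.0.99 stands in for the stale IP):

DELETE FROM system.peers WHERE peer = '10.0.0.99';   -- run via cqlsh on each node

Over JMX, the operation is unsafeAssassinateEndpoint on the 
org.apache.cassandra.net:type=Gossiper MBean; on 2.2+ nodetool wraps it as:

nodetool assassinate 10.0.0.99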



From: Agrawal, Pratik 
Sent: Friday, January 27, 2017 9:05:43 AM
To: user@cassandra.apache.org; k...@instaclustr.com; pskraj...@gmail.com
Cc: Sun, Guan
Subject: Re: Re : Decommissioned nodes show as DOWN in Cassandra versions 
2.1.12 - 2.1.16

We are seeing the same issue with Cassandra 2.0.8. The nodetool gossipinfo 
reports a node being down even after we decommission the node from the cluster.

Thanks,
Pratik

From: kurt greaves >
Reply-To: "user@cassandra.apache.org" 
>
Date: Friday, January 27, 2017 at 5:54 AM
To: "user@cassandra.apache.org" 
>
Subject: Re: Re : Decommissioned nodes show as DOWN in Cassandra versions 
2.1.12 - 2.1.16

we've seen this issue on a few clusters, including on 2.1.7 and 2.1.8. pretty 
sure it is an issue in gossip that's known about. in later versions it seems to 
be fixed.

On 24 Jan 2017 06:09, "sai krishnam raju potturi" 
> wrote:

In the Cassandra versions 2.1.11 - 2.1.16, after we decommission a node or 
datacenter, we observe the decommissioned nodes marked as DOWN in the cluster 
when you do a "nodetool describecluster". The nodes however do not show up in 
the "nodetool status" command.
The decommissioned node also does not show up in the "system_peers" table on 
the nodes.

The workaround we follow is rolling restart of the cluster, which removes the 
decommissioned nodes from the "UNREACHABLE STATE", and shows the actual state 
of the cluster. The workaround is tedious for huge clusters.

We also verified the decommission process in CCM tool, and observed the same 
issue for clusters with versions from 2.1.12 to 2.1.16. The issue was not 
observed in versions prior to or later than the ones mentioned above.


Has anybody in the community observed similar issue? We've also raised a JIRA 
issue regarding this.   https://issues.apache.org/jira/browse/CASSANDRA-13144


Below are the observed logs from the versions without the bug and with the 
bug. The ones highlighted in yellow show the expected logs. The ones 
highlighted in red are the ones where the node is recognized as down and shows 
as UNREACHABLE.



Cassandra 2.1.1 Logs showing the decommissioned node :  (Without the bug)

2017-01-19 20:18:56,415 [GossipStage:1] DEBUG ArrivalWindow Ignoring interval 
time of 2049943233 for /X.X.X.X
2017-01-19 20:18:56,416 [GossipStage:1] DEBUG StorageService Node /X.X.X.X 
state left, tokens [ 59353109817657926242901533144729725259, 
60254520910109313597677907197875221475, 75698727618038614819889933974570742305, 
84508739091270910297310401957975430578]
2017-01-19 20:18:56,416 [GossipStage:1] DEBUG Gossiper adding expire time for 
endpoint : /X.X.X.X (1485116334088)
2017-01-19 20:18:56,417 [GossipStage:1] INFO StorageService Removing tokens 
[100434964734820719895982857900842892337, 
114144647582686041354301802358217767299, 
13209060517964702932350041942412177, 
138409460913927199437556572481804704749] for /X.X.X.X
2017-01-19 20:18:56,418 [HintedHandoff:3] INFO HintedHandOffManager Deleting 
any stored hints for /X.X.X.X
2017-01-19 20:18:56,424 [GossipStage:1] DEBUG MessagingService Resetting 
version for /X.X.X.X
2017-01-19 20:18:56,424 [GossipStage:1] DEBUG Gossiper removing endpoint 
/X.X.X.X
2017-01-19 20:18:56,437 [GossipStage:1] DEBUG StorageService Ignoring state 
change for dead or unknown endpoint: /X.X.X.X
2017-01-19 20:19:02,022 [WRITE-/X.X.X.X] DEBUG OutboundTcpConnection attempting 
to connect to /X.X.X.X
2017-01-19 20:19:02,023 [HANDSHAKE-/X.X.X.X] INFO OutboundTcpConnection 
Handshaking version with /X.X.X.X
2017-01-19 20:19:02,023 [WRITE-/X.X.X.X] DEBUG MessagingService Setting version 
7 for /X.X.X.X
2017-01-19 20:19:08,096 [GossipStage:1] DEBUG ArrivalWindow Ignoring interval 
time of 2074454222 for /X.X.X.X
2017-01-19 20:19:54,407 [GossipStage:1] DEBUG ArrivalWindow Ignoring interval 
time of 4302985797 for /X.X.X.X
2017-01-19 20:19:57,405 [GossipTasks:1] DEBUG Gossiper 6 elapsed, /X.X.X.X 
gossip quarantine over
2017-01-19 20:19:57,455 [GossipStage:1] DEBUG ArrivalWindow Ignoring interval 
time of 3047826501 for /X.X.X.X
2017-01-19 20:19:57,455 [GossipStage:1] DEBUG StorageService Ignoring state 
change for dead or unknown endpoint: /X.X.X.X


Cassandra 2.1.16 logs showing the decommissioned node: (the logs in 2.1.16 
show the same as 2.1.1 up to "DEBUG Gossiper 6 elapsed, /X.X.X.X gossip 
quarantine over", and then are followed by "NODE is now DOWN")

2017-01-19 19:52:23,687 [GossipStage:1] DEBUG 

Re: Has anyone deployed a production cluster with less than 6 nodes per DC?

2016-12-26 Thread Harikrishnan Pillai
1 million writes per hour is only about 280 writes per second; it's easily 
achievable with 3 nodes. Make sure that you have good GC tuning and compaction 
tuning.

Sent from my iPhone

On Dec 26, 2016, at 1:27 PM, Ney, Richard 
> wrote:

My company has a product we're about to deploy into AWS with Cassandra setup as 
a two 3 node clusters in two availability zones (m4.2xlarge with 2 500GB EBS 
volumes per node). We're doing over a million writes per hour with the cluster 
setup with R-2 and local quorum writes. We run successfully for several hours 
before Cassandra goes into the weeds and we start getting write timeouts to the 
point we must kill the Cassandra JVM processes to get the Cassandra cluster to 
restart. I keep raising to my upper management that the cluster is severely 
undersized but management is complaining that setting up 12 nodes is too 
expensive and to change the code to reduce load on Cassandra.

So, the main question is "Is there any hope of success with a 3 node DC setup 
of Cassandra in production or are we on a fool's errand?"

RICHARD NEY
TECHNICAL DIRECTOR, RESEARCH & DEVELOPMENT
+1 (978) 848.6640 WORK
+1 (916) 846.2353 MOBILE
UNITED STATES
richard@aspect.com
aspect.com

This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments.


Re: Handling Leap second delay

2016-12-20 Thread Harikrishnan Pillai
http://www.datastax.com/dev/blog/preparing-for-the-leap-second-2017





From: Sanjeev T 
Sent: Tuesday, December 20, 2016 7:54:29 PM
To: user@cassandra.apache.org
Subject: Handling Leap second delay

Hi,

Can some of you share pointers on the affected versions and on handling the 
leap second delay on Dec 31, 2016?

Regards
-Sanjeev



Re: Cassandra Different cluster gossiping to each other

2016-12-14 Thread Harikrishnan Pillai
This is possible if some of the nodes are present in the system.peers table 
of the other cluster. This usually occurs when we decommission nodes from one 
cluster and add them to another cluster. Also make sure that, before adding a 
node to a new cluster, all data on its drives is properly wiped out.
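
For example (paths assume a default package install; stop Cassandra on the 
node first):

sudo rm -rf /var/lib/cassandra/data/* \
            /var/lib/cassandra/commitlog/* \
            /var/lib/cassandra/saved_caches/*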

Sent from my iPhone

On Dec 14, 2016, at 3:11 AM, Abhishek Kumar Maheshwari 
>
 wrote:

Hi All,

I am getting below log in my system.log


GossipDigestSynVerbHandler.java:52 - ClusterName mismatch from /192.XXX.AA.133 
QA Columbia Cluster! = QA Columbia Cluster new

192.XXX.AA.133's cluster name is QA Columbia Cluster.
On the server where I am getting this error, the cluster name is: QA Columbia 
Cluster new.

I am using apache-cassandra-2.2.3. Please let me know how I can fix this.



Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

A must visit exhibition for all Fitness and Sports Freaks. TOI Global Sports 
Business Show from 21 to 23 December 2016 Bombay Exhibition Centre, Mumbai. 
Meet the legends Kaizzad Capadia, Bhaichung Bhutia and more. Join the workshops 
on Boxing & Football and more. www.TOI-GSBS.com


Re: Huge files in level 1 and level 0 of LeveledCompactionStrategy

2016-12-07 Thread Harikrishnan Pillai
This can happen as part of a node bootstrap, repair, or rebuild.


From: Sotirios Delimanolis 
Sent: Wednesday, December 7, 2016 4:35:45 PM
To: User
Subject: Huge files in level 1 and level 0 of LeveledCompactionStrategy

I have a couple of SSTables that are humongous

-rw-r--r-- 1 user group 138933736915 Dec  1 03:41 lb-29677471-big-Data.db
-rw-r--r-- 1 user group  78444316655 Dec  1 03:58 lb-29677495-big-Data.db
-rw-r--r-- 1 user group 212429252597 Dec  1 08:20 lb-29678145-big-Data.db

sstablemetadata reports that these are all in SSTable Level 0. This table is 
running with

compaction = {'sstable_size_in_mb': '200', 'tombstone_threshold': '0.25', 
'tombstone_compaction_interval': '300', 'class': 
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}

How could this happen?


Re: Which version is stable enough for production environment?

2016-11-30 Thread Harikrishnan Pillai
https://issues.apache.org/jira/browse/CASSANDRA-12728


https://issues.apache.org/jira/browse/CASSANDRA-12844


Also, when I tested some of our write-heavy workloads, Leveled Compaction was 
not keeping up. With the same system settings, 2.1.16 performed better and all 
levels were properly aligned.


From: Benjamin Roth <benjamin.r...@jaumo.com>
Sent: Tuesday, November 29, 2016 11:20:19 PM
To: user@cassandra.apache.org
Subject: Re: Which version is stable enough for production environment?

What are the compaction issues / hint corruptions you encountered? Are there 
JIRA tickets for them?
I am curious because I use 3.10 (trunk) in production.

For anyone who is planning to use MVs:
They basically work. We use them in production since some months, BUT (it's a 
quite big one) maintainance is a pain. Bootstrapping and repairs may be - 
depending on the model, config, amount of data - really, really painful. I'm 
currently investigating intensively.

2016-11-30 3:11 GMT+01:00 Harikrishnan Pillai 
<hpil...@walmartlabs.com>:

3.0 has "off the heap memtable" impl removed and if you have a requirement for 
this,its not available.If you don't have the requirement 3.0.9 can be tried 
out. 3.9 version we did some testing and find lot issues in compaction,hint 
corruption etc.

Regards

Hari



From: Discovery <wl_...@qq.com>
Sent: Tuesday, November 29, 2016 5:59 PM
To: user
Subject: Re: Which version is stable enough for production environment?

Why is version 3.x not recommended?  Thanks.


-- Original --
From:  "Harikrishnan 
Pillai";<hpil...@walmartlabs.com<mailto:hpil...@walmartlabs.com>>;
Date:  Wed, Nov 30, 2016 09:57 AM
To:  "user"<user@cassandra.apache.org<mailto:user@cassandra.apache.org>>;
Subject:  Re: Which version is stable enough for production environment?


Cassandra 2.1.16



From: Discovery <wl_...@qq.com>
Sent: Tuesday, November 29, 2016 5:42 PM
To: user
Subject: Which version is stable enough for production environment?

Hi Cassandra Experts,

  We are preparing to deploy Cassandra in a production environment, but we 
cannot confirm which version is stable and recommended. Could someone on this 
mailing list give a suggestion? Thanks in advance!


Best Regards
Discovery
11/30/2016



--
Benjamin Roth
Prokurist

Jaumo GmbH * www.jaumo.com
Wehrstraße 46 * 73035 Göppingen * Germany
Phone +49 7161 304880-6 * Fax +49 7161 304880-1
AG Ulm * HRB 731058 * Managing Director: Jens Kammerer


Re: Which version is stable enough for production environment?

2016-11-29 Thread Harikrishnan Pillai
3.0 has "off the heap memtable" impl removed and if you have a requirement for 
this,its not available.If you don't have the requirement 3.0.9 can be tried 
out. 3.9 version we did some testing and find lot issues in compaction,hint 
corruption etc.

Regards

Hari



From: Discovery <wl_...@qq.com>
Sent: Tuesday, November 29, 2016 5:59 PM
To: user
Subject: Re: Which version is stable enough for production environment?

Why is version 3.x not recommended?  Thanks.


-- Original --
From:  "Harikrishnan Pillai";<hpil...@walmartlabs.com>;
Date:  Wed, Nov 30, 2016 09:57 AM
To:  "user"<user@cassandra.apache.org>;
Subject:  Re: Which version is stable enough for production environment?


Cassandra 2.1.16



From: Discovery <wl_...@qq.com>
Sent: Tuesday, November 29, 2016 5:42 PM
To: user
Subject: Which version is stable enough for production environment?

Hi Cassandra Experts,

  We are preparing to deploy Cassandra in a production environment, but we 
cannot confirm which version is stable and recommended. Could someone on this 
mailing list give a suggestion? Thanks in advance!


Best Regards
Discovery
11/30/2016


Re: Which version is stable enough for production environment?

2016-11-29 Thread Harikrishnan Pillai
Cassandra 2.1.16



From: Discovery 
Sent: Tuesday, November 29, 2016 5:42 PM
To: user
Subject: Which version is stable enough for production environment?

Hi Cassandra Experts,

  We are preparing to deploy Cassandra in a production environment, but we 
cannot confirm which version is stable and recommended. Could someone on this 
mailing list give a suggestion? Thanks in advance!


Best Regards
Discovery
11/30/2016


Re: Java GC pauses, reality check

2016-11-28 Thread Harikrishnan Pillai
Hi @Kant Kodali,

11/11 means 11 nodes in DC1 and 11 nodes in DC2.



From: Kant Kodali <k...@peernova.com>
Sent: Monday, November 28, 2016 6:56 AM
To: user@cassandra.apache.org
Subject: Re: Java GC pauses, reality check

Hi Hari,

I am a little bit confused.

What do you mean by 11/11?

"We are using g1GC in most clusters with 26GB heap and extra threads given to 
parallel and old gen collection. Those clusters 99% is also under 5 ms and 
doing good". So with G1GC you are able get under 5ms not the C4 (Zing's Garbage 
Collector?)

What timeouts are you referring to here?

Thanks,
kant

On Sun, Nov 27, 2016 at 9:57 PM, Harikrishnan Pillai 
<hpil...@walmartlabs.com> wrote:

Hi @Kant Kodali,

We have multiple clusters running Zing.

One cluster has 11/11 and another one also has 11/11 (190 GB memory, 6 TB hard 
disk and 16 physical-core machines).

The average read size is around 200 KB and it can go up to 6 MB.

We are using G1GC in most clusters with a 26GB heap and extra threads given to 
parallel and old gen collection. Those clusters' 99th percentile is also under 
5 ms and doing good. We used Zing to remove all timeouts. If the application 
does not have that requirement, G1GC is good.

With G1GC I have seen average 200-300 ms pauses every 4 minutes and 600 ms 
pauses every 6 hours, and 99% latency is under 5-10 ms for most of the 
clusters reading 10-100 KB of data.
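
As a rough sketch, those settings would look something like this in 
cassandra-env.sh (the thread counts are assumptions for a 16-core box):

JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -Xms26G -Xmx26G"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=200"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=16"  # extra parallel GC threads
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=8"       # extra concurrent old-gen threads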

Regards

Hari


From: Kant Kodali <k...@peernova.com>
Sent: Saturday, November 26, 2016 8:39:01 PM
To: user@cassandra.apache.org
Subject: Re: Java GC pauses, reality check

@Harikrishnan Pillai: How many nodes you guys are running? and what is an 
approximate read size and an approximate write size?

On Fri, Nov 25, 2016 at 7:32 PM, Harikrishnan Pillai 
<hpil...@walmartlabs.com> wrote:
We are running Azul Zing in prod with 1 million reads/s and 100K writes/s; we 
never had a major GC above 10 ms.

Sent from my iPhone

> On Nov 25, 2016, at 3:49 PM, Martin Schröder 
> <mar...@oneiros.de> wrote:
>
> 2016-11-25 23:38 GMT+01:00 Kant Kodali 
> <k...@peernova.com>:
>> I would also restate the following sentence "java GC pauses are pretty much
>> a fact of life" to "Any GC based system pauses are pretty much a fact of
>> life".
>>
>> I would be more than happy to see if someone can counter prove.
>
> Azul disagrees.
> https://www.azul.com/products/zing/pgc/
>
> Best
>   Martin




Re: Java GC pauses, reality check

2016-11-27 Thread Harikrishnan Pillai
Hi @Kant Kodali,

We have multiple clusters running Zing.

One cluster has 11/11 and another one also has 11/11 (190 GB memory, 6 TB hard 
disk and 16 physical-core machines).

The average read size is around 200 KB and it can go up to 6 MB.

We are using G1GC in most clusters with a 26GB heap and extra threads given to 
parallel and old gen collection. Those clusters' 99th percentile is also under 
5 ms and doing good. We used Zing to remove all timeouts. If the application 
does not have that requirement, G1GC is good.

With G1GC I have seen average 200-300 ms pauses every 4 minutes and 600 ms 
pauses every 6 hours, and 99% latency is under 5-10 ms for most of the 
clusters reading 10-100 KB of data.

Regards

Hari


From: Kant Kodali <k...@peernova.com>
Sent: Saturday, November 26, 2016 8:39:01 PM
To: user@cassandra.apache.org
Subject: Re: Java GC pauses, reality check

@Harikrishnan Pillai: How many nodes you guys are running? and what is an 
approximate read size and an approximate write size?

On Fri, Nov 25, 2016 at 7:32 PM, Harikrishnan Pillai 
<hpil...@walmartlabs.com> wrote:
We are running Azul Zing in prod with 1 million reads/s and 100K writes/s; we 
never had a major GC above 10 ms.

Sent from my iPhone

> On Nov 25, 2016, at 3:49 PM, Martin Schröder 
> <mar...@oneiros.de> wrote:
>
> 2016-11-25 23:38 GMT+01:00 Kant Kodali 
> <k...@peernova.com>:
>> I would also restate the following sentence "java GC pauses are pretty much
>> a fact of life" to "Any GC based system pauses are pretty much a fact of
>> life".
>>
>> I would be more than happy to see if someone can counter prove.
>
> Azul disagrees.
> https://www.azul.com/products/zing/pgc/
>
> Best
>   Martin



Re: Java GC pauses, reality check

2016-11-25 Thread Harikrishnan Pillai
We are running Azul Zing in prod with 1 million reads/s and 100K writes/s; we 
never had a major GC above 10 ms.

Sent from my iPhone

> On Nov 25, 2016, at 3:49 PM, Martin Schröder  wrote:
> 
> 2016-11-25 23:38 GMT+01:00 Kant Kodali :
>> I would also restate the following sentence "java GC pauses are pretty much
>> a fact of life" to "Any GC based system pauses are pretty much a fact of
>> life".
>> 
>> I would be more than happy to see if someone can counter prove.
> 
> Azul disagrees.
> https://www.azul.com/products/zing/pgc/
> 
> Best
>   Martin


Re: Java GC pauses, reality check

2016-11-25 Thread Harikrishnan Pillai
The Zing JVM keeps pauses under 10 ms for most use cases.

Sent from my iPhone

On Nov 25, 2016, at 2:44 PM, Kant Kodali 
> wrote:

+1 Chris Lohfink response

I would also restate the following sentence "java GC pauses are pretty much a 
fact of life" to "Any GC based system pauses are pretty much a fact of life".

I would be more than happy to see if someone can counter prove.



On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink 
> wrote:
No tuning will eliminate gcs.

20-30 seconds is horrific and out of the ordinary. Most likely implementing 
antipatterns and/or poorly configured. Sub 1s is realistic but with some 
workloads still may require some tuning to maintain. Some workloads are very 
unfriendly to GCs though (ie heavy tombstones, very wide partitions).

Chris

On Fri, Nov 25, 2016 at 3:25 PM, S Ahmed 
> wrote:
Hello!

From what I understand java GC pauses are pretty much a fact of life, but you 
can tune the JVM to reduce the frequency and length of GC pauses.

When using Cassandra, how frequent or long have these pauses known to be?  Even 
with tuning, is it safe to assume they cannot be eliminated?

Would a 20-30 second pause be something out of the ordinary?

Thanks.




Re: Cannot set TTL in COPY command

2016-10-26 Thread Harikrishnan Pillai
I have created a JIRA for Cassandra version 3.9. Has anyone seen this 
scenario before in any 3.x version?

https://issues.apache.org/jira/browse/CASSANDRA-12844

Regards

Hari


From: Lahiru Gamathige 
Sent: Wednesday, October 26, 2016 10:46:51 AM
To: user@cassandra.apache.org
Subject: Re: Cannot set TTL in COPY command

I highly recommend moving to a newer Cassandra version first, because TTL and 
compaction are much more consistent.
On Wed, Oct 26, 2016 at 10:36 AM, Tyler Hobbs 
> wrote:

On Wed, Oct 26, 2016 at 10:07 AM, techpyaasa . 
> wrote:
Can some one please tell me how to set TTL using COPY command?

It looks like you're using Cassandra 2.0.  I don't think COPY supports the TTL 
option until at least 2.1.
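
For reference, on a recent cqlsh the option looks roughly like this (keyspace, 
table and file names are placeholders):

COPY myks.mytable FROM 'mytable.csv' WITH HEADER = true AND TTL = 86400;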


--
Tyler Hobbs
DataStax



Node tool drain causing hint corruption

2016-10-26 Thread Harikrishnan Pillai
Changing the subject

Sent from my iPhone

On Oct 26, 2016, at 12:28 PM, Harikrishnan Pillai 
<hpil...@walmartlabs.com> wrote:


I have created a JIRA for Cassandra version 3.9. Has anyone seen this 
scenario before in any 3.x version?

https://issues.apache.org/jira/browse/CASSANDRA-12844

Regards

Hari


From: Lahiru Gamathige <lah...@highfive.com>
Sent: Wednesday, October 26, 2016 10:46:51 AM
To: user@cassandra.apache.org
Subject: Re: Cannot set TTL in COPY command

I highly recommend moving to a newer Cassandra version first, because TTL and 
compaction are much more consistent.
On Wed, Oct 26, 2016 at 10:36 AM, Tyler Hobbs 
<ty...@datastax.com> wrote:

On Wed, Oct 26, 2016 at 10:07 AM, techpyaasa . 
<techpya...@gmail.com> wrote:
Can some one please tell me how to set TTL using COPY command?

It looks like you're using Cassandra 2.0.  I don't think COPY supports the TTL 
option until at least 2.1.


--
Tyler Hobbs
DataStax



Re: Does anyone store larger values in Cassandra E.g. 500 KB?

2016-10-20 Thread Harikrishnan Pillai
We use Cassandra to store images. Any data above 2 MB we chunk and store. It 
works perfectly.
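
A sketch of such a chunked layout (the 2 MB threshold is as described above; 
table and column names are illustrative):

CREATE TABLE images (
  image_id uuid,
  chunk_no int,
  data blob,
  PRIMARY KEY (image_id, chunk_no)
);
-- writers split anything above 2 MB into one row per chunk; readers select
-- all rows for an image_id (chunks come back ordered) and reassemble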

Sent from my iPhone

> On Oct 20, 2016, at 12:09 PM, Vikas Jaiman  wrote:
> 
> Hi,
> 
> Normally people would like to store smaller values in Cassandra. Is there 
> anyone using it to store larger values (e.g. 500KB or more), and if so, what 
> are the issues you are facing? I would also like to know the tweaks which 
> you are considering.
> 
> Thanks,
> Vikas


Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Harikrishnan Pillai
In my experience, DC-local repair node by node with the -pr and -par options 
is best. Full repair increased SSTables a lot and took days to compact back. 
Another easy option for repair is to use a Spark job: read all data with 
consistency ALL and increase read repair chance to 100%, or use the Netflix 
tickler.
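
A sketch of the node-by-node command (2.1-era flags; note that some versions 
reject combining -pr with -local, since together they can skip ranges):

nodetool repair -par -pr      # run on every node in every DC, one at a time
nodetool repair -par -local   # DC-local alternative, without -pr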

Sent from my iPhone

On Oct 12, 2016, at 11:44 AM, Anuj Wadehra 
> wrote:

Hi Leena,

The first thing you should be concerned about is: why doesn't the repair -pr 
operation complete?
Second comes the question: which repair option is best?


One probable cause of stuck repairs: if the firewall between DCs is closing 
TCP connections and Cassandra is trying to use such connections, repairs will 
hang. Please refer to 
https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
We faced that.

Also make sure you comply with the basic bandwidth requirement between DCs: 
recommended is 1000 Mb/s (1 gigabit) or greater.

Answers for specific questions:
1. As per my understanding, not all replicas will participate in DC-local 
repairs, and thus the repair would be ineffective. You need to make sure that 
all replicas of the data in all DCs are in sync.

2. Every DC is not a ring; all DCs together form one token ring. So, I think 
yes, you should run repair -pr on all nodes.

3. Yes. I don't have experience with incremental repairs, but you can run 
repair -pr on all nodes of all DCs.

Regarding Best approach of repair, you should see some repair presentations of 
Cassandra Summit 2016. All are online now.

I attended the summit, and people using large clusters generally use subrange 
repairs to repair their clusters. But such large deployments are on older 
Cassandra versions, and these deployments generally don't use vnodes, so 
people easily know which nodes hold which token ranges.



Thanks
Anuj



From: Leena Ghatpande >;
To: user@cassandra.apache.org 
>;
Subject: Repair in Multi Datacenter - Should you use -dc Datacenter repair or 
repair with -pr
Sent: Wed, Oct 12, 2016 2:15:51 PM


Please advise. I cannot find any clear documentation on what the best 
strategy is for repairing nodes on a regular basis with multiple datacenters 
involved.


We are running Cassandra 3.7 in multiple datacenters with 4 nodes in each 
data center. We are trying to run repairs every other night to keep the nodes 
in a good state. We currently run repair with the -pr option, but the repair 
process gets hung and does not complete gracefully. We don't see any errors 
in the logs either.


What is the best way to perform repairs on multiple data centers with large 
tables?

1. Can we run a datacenter repair using the -dc option for each data center? 
Do we need to run repair on each node in that case, or will it repair all 
nodes within the datacenter?

2. Is running repair with -pr across all nodes required if we perform step 1 
every night?

3. Is cross-datacenter repair required, and if so, what's the best option?


Thanks


Leena