Re: Records in table after deleting sstable manually

2020-08-11 Thread Kunal
Thanks Jeff. Appreciate your reply. As you said, it looks like there were
entries in the commitlogs, and when Cassandra was brought up after deleting
the sstables, data from the commitlog was replayed. Maybe next time I will
let the replay happen after deleting the sstables and then truncate the table
using CQL. This will ensure my table is empty. I could not truncate from CQL
in the first place because one of the nodes was not up.
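For reference, the CQL truncate mentioned here could be run like this once all nodes are up (a sketch; the keyspace/table name and node address are placeholders, not from this thread):

```shell
# Truncate removes all data for the table on every node, and also covers
# data that would otherwise come back via commitlog replay or hints.
cqlsh <node_ip> -e "TRUNCATE my_keyspace.my_table;"

# TRUNCATE takes an automatic snapshot when auto_snapshot is true (the
# default), so reclaim the disk space afterwards:
nodetool clearsnapshot my_keyspace
```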

Regards,
Kunal

On Tue, Aug 11, 2020 at 8:45 AM Jeff Jirsa  wrote:

> The data probably came from either hints or commitlog replay.
>
> If you use `truncate` from CQL, it solves both of those concerns.
>
>
> On Tue, Aug 11, 2020 at 8:42 AM Kunal  wrote:
>
>> Hi,
>>
>> We have a 3-node Cassandra cluster, and one of the tables grew big, to
>> around 2 GB, when it was supposed to be a few MBs. During nodetool repair,
>> one of the Cassandra nodes went down. Even after multiple restarts, that
>> node kept going down a few minutes after coming up. We decided to truncate
>> the table by removing the corresponding sstables from disk, since
>> truncating a table from cqlsh needs all the nodes to be up, which was not
>> the case in our environment. After deleting the sstables from disk on all
>> 3 nodes, we brought Cassandra up; all the nodes came up fine and we don't
>> see any issues, but we observed that the sstable size is ~100 MB, which
>> was a bit strange, and the table has old rows (around 20K) from a previous
>> date; before the removal there were 500K rows. Not sure how the table has
>> old records and the sstables are ~100 MB even after removing them.
>> Any ideas? Any help understanding this would be appreciated.
>>
>> Regards,
>> Kunal
>>
>

-- 



Regards,
Kunal Vaid


Records in table after deleting sstable manually

2020-08-11 Thread Kunal
Hi,

We have a 3-node Cassandra cluster, and one of the tables grew big, to
around 2 GB, when it was supposed to be a few MBs. During nodetool repair,
one of the Cassandra nodes went down. Even after multiple restarts, that
node kept going down a few minutes after coming up. We decided to truncate
the table by removing the corresponding sstables from disk, since truncating
a table from cqlsh needs all the nodes to be up, which was not the case in
our environment. After deleting the sstables from disk on all 3 nodes, we
brought Cassandra up; all the nodes came up fine and we don't see any issues,
but we observed that the sstable size is ~100 MB, which was a bit strange,
and the table has old rows (around 20K) from a previous date; before the
removal there were 500K rows. Not sure how the table has old records and the
sstables are ~100 MB even after removing them.
Any ideas? Any help understanding this would be appreciated.

Regards,
Kunal
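
As the reply in this thread notes, rows that reappear after deleting sstables usually come from commitlog replay or hints. When files must be removed manually, one sequence that avoids the replay (a sketch, run per node) is:

```shell
# Flush memtables and make the node stop accepting writes; after drain
# completes, the commitlog is flushed, so a later restart has nothing
# stale to replay for the table.
nodetool drain

# then, with cassandra stopped, remove the table's sstable files and start it again
```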


Re: Disabling Swap for Cassandra

2020-04-16 Thread Kunal
Thanks for the responses. Appreciate them.

@Dor, so you are saying that if we add "memlock unlimited" in limits.conf,
the entire heap (Xms=Xmx) can be locked at startup? Will this be applied to
all Java processes? We have a couple of Java programs running under the same
owner.


Thanks
Kunal

On Thu, Apr 16, 2020 at 4:31 PM Dor Laor  wrote:

> It is good to configure swap for the OS but exempt Cassandra
> from swapping. Why is it good? Since you never know the
> memory utilization of additional agents and processes you or
> other admins will run on your server.
>
> So do configure a swap partition.
> You can control the eagerness of the kernel by the swappiness
> sysctl parameter. You can even control it per cgroup:
>
> https://askubuntu.com/questions/967588/how-can-i-prevent-certain-process-from-being-swapped
>
> You should make sure Cassandra locks its memory so the kernel
> won't choose its memory to be swapped out (since it will kill
> your latency). You do it by mlock. Read more on:
>
> https://stackoverflow.com/questions/578137/can-i-tell-linux-not-to-swap-out-a-particular-processes-memory
>
> The scylla /dist/common/limits.d/scylladb.com looks like this:
> scylla  -  core unlimited
> scylla  -  memlock  unlimited
> scylla  -  nofile   20
> scylla  -  as   unlimited
> scylla  -  nproc8096
>
> On Thu, Apr 16, 2020 at 3:57 PM Nitan Kainth 
> wrote:
> >
> > Swap is controlled by OS and will use it when running short of memory. I
> don’t think you can disable at Cassandra level
> >
> >
> > Regards,
> >
> > Nitan
> >
> > Cell: 510 449 9629
> >
> >
> > On Apr 16, 2020, at 5:50 PM, Kunal  wrote:
> >
> > 
> >
> > Hello,
> >
> >
> >
> > I need some suggestions from you all. I am new to Cassandra and was
> reading about Cassandra best practices. One document mentioned that
> Cassandra should not use swap, as it degrades performance.
> >
> > My question is: instead of disabling swap system-wide, can we force
> Cassandra not to use swap? Some documentation suggests using
> memory_locking_policy in cassandra.yaml.
> >
> >
> > How do I check whether our Cassandra already has this parameter and still
> uses swap? Is there any way I can check this? I already checked
> cassandra.yaml and don't see this parameter. Is there any other place I can
> check and confirm?
> >
> >
> > Also, can I set the memlock parameter to unlimited (64 kB default), so the
> entire heap (Xms = Xmx) can be locked at node startup? Will that help?
> >
> >
> > Or if you have any other suggestions, please let me know.
> >
> >
> >
> >
> >
> > Regards,
> >
> > Kunal
> >
> >
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>

-- 



Regards,
Kunal Vaid


Disabling Swap for Cassandra

2020-04-16 Thread Kunal
Hello,



I need some suggestions from you all. I am new to Cassandra and was reading
about Cassandra best practices. One document mentioned that Cassandra should
not use swap, as it degrades performance.

My question is: instead of disabling swap system-wide, can we force
Cassandra not to use swap? Some documentation suggests using
memory_locking_policy in cassandra.yaml.


How do I check whether our Cassandra already has this parameter and still
uses swap? Is there any way I can check this? I already checked
cassandra.yaml and don't see this parameter. Is there any other place I can
check and confirm?


Also, can I set the memlock parameter to unlimited (64 kB default), so the
entire heap (Xms = Xmx) can be locked at node startup? Will that help?


Or if you have any other suggestions, please let me know.
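
One way to check the current state on a running node might be (a sketch assuming a Linux host; the proc paths are standard, but the process-matching pattern is an assumption worth verifying locally):

```shell
# Find the Cassandra JVM (assumes a single instance on the host).
pid=$(pgrep -f CassandraDaemon | head -1)

# VmSwap: how much of the process is swapped out; VmLck: how much is
# locked in RAM (non-zero suggests memory locking is in effect).
grep -E 'VmSwap|VmLck' /proc/$pid/status

# Effective memlock limit for the running process (not just limits.conf):
grep 'locked memory' /proc/$pid/limits

# System-wide swappiness; lower values make the kernel avoid swapping.
cat /proc/sys/vm/swappiness
```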





Regards,

Kunal


One of the cassandra node is going down.

2019-05-30 Thread Kunal
Hi All,

I am facing a situation in my 3-node Cassandra cluster wherein one of the
nodes keeps going down after around 5-10 minutes.

The messages below are seen in the debug.log of the node which is going down:

===


INFO  [ScheduledTasks:1] 2019-05-30 14:39:25,179 StatusLogger.java:101 -
system_schema.views  2,16

INFO  [Service Thread] 2019-05-30 14:39:25,182 StatusLogger.java:101 -
system.schema_keyspaces   0,0

INFO  [ScheduledTasks:1] 2019-05-30 14:39:25,182 StatusLogger.java:101 -
system_schema.functions  2,16

INFO  [Service Thread] 2019-05-30 14:39:32,569 StatusLogger.java:101 -
system.sstable_activity 280,10053

WARN  [GossipTasks:1] 2019-05-30 14:39:32,572 FailureDetector.java:288 -
Not marking nodes down due to local pause of 7413014745 > 50

DEBUG [GossipTasks:1] 2019-05-30 14:39:32,578 FailureDetector.java:294 -
Still not marking nodes down due to local pause

INFO  [ScheduledTasks:1] 2019-05-30 14:39:32,577 StatusLogger.java:101 -
virtuoranc.pmcollectionstatus 0,0

INFO  [Service Thread] 2019-05-30 14:39:32,578 StatusLogger.java:101 -
system.batchlog   0,0

INFO  [ScheduledTasks:1] 2019-05-30 14:39:32,579 StatusLogger.java:101 -
virtuoranc.snmp_trapdestination 0,0

INFO  [Service Thread] 2019-05-30 14:39:32,579 StatusLogger.java:101 -
system.schema_columns 0,0

INFO  [ScheduledTasks:1] 2019-05-30 14:39:32,579 StatusLogger.java:101 -
virtuoranc.auditlog   0,0

INFO  [Service Thread] 2019-05-30 14:39:32,580 StatusLogger.java:101 -
system.hints  0,0

INFO  [ScheduledTasks:1] 2019-05-30 14:39:32,580 StatusLogger.java:101 -
virtuoranc.jobproperties  0,0

INFO  [Service Thread] 2019-05-30 14:39:32,580 StatusLogger.java:101 -
system.IndexInfo  0,0

 =


We tried to clean this node and start it, and ran nodetool repair -full as
well, but the node went down in between. Also, nodetool commands start taking
too long to produce output once Cassandra has been up for 3-4 minutes, and at
one point nodetool gives the error below.

nodetool tpstats

nodetool: Failed to connect to '127.0.0.1:7199' - SocketTimeoutException:
'Read timed out'.
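
When nodetool starts timing out like this, a couple of quick checks (a sketch; 7199 is the default JMX port, and the log path assumes a package install) are whether the JMX port is still accepting connections and whether the JVM is stuck in long GC pauses, which would match the "local pause" warnings in the log excerpt above:

```shell
# Is the local JMX port reachable at all?
nc -zv 127.0.0.1 7199

# Long stop-the-world pauses show up as GCInspector / StatusLogger entries:
grep -iE 'GCInspector|pause' /var/log/cassandra/system.log | tail -20
```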


Can you please let me know what is happening with this node? Any help is
appreciated.




Regards,
Kunal Vaid


Re: Unpair cassandra datacenters

2019-04-23 Thread Kunal
Thanks Sandeep for your reply. Let me try out the steps you suggested. I
will let you know. Appreciate your help.
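
For anyone following along, steps 2, 4 and 5 from the list quoted below might look roughly like this (a sketch; the subnet, keyspace name and host ID are placeholders, not from this thread):

```shell
# Step 2: block gossip/streaming traffic from the other DC's subnet
# (10.1.0.0/16 is a placeholder; add port 7001 only if internode SSL is on).
iptables -A INPUT -p tcp -s 10.1.0.0/16 --dport 7000 -j DROP

# Step 4: drop the remote DC from each keyspace's replication map.
cqlsh -e "ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};"

# Step 5: from the surviving DC, remove the now-unreachable remote nodes.
nodetool removenode <host-id-from-nodetool-status>
```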


Regards,
Kunal Vaid

On Mon, Apr 22, 2019 at 4:18 PM Sandeep Nethi 
wrote:

> Hi Kunal,
>
> The simple solution for this case would be as follows,
>
> 1. Run *Full repair.*
> 2. Add firewall to block network on port 7000(,7001 if ssl enabled)
> between two datacenter nodes.
> 3. Check the status of cassandra cluster from both data centers, each DC
> must show down node status of another DC nodes after the firewall change.
> 4. Change replication factor for all keyspaces on each data center.
> 5. Start decommissioning nodes from each datacenter (Should be removenode
> in this case).
> 6. Update seeds list on each datacenter to local datacenter nodes and
> perform a rolling restart.
>
> Hope this helps, Try to test this scenario on non-prod system first.
>
> Thanks,
> Sandeep
>
> On Tue, Apr 23, 2019 at 11:00 AM Kunal  wrote:
>
>> HI Marc,
>>
>> Appreciate your prompt response.
>>
>> Yes, we are starting datacenter B from scratch. We tried changing the
>> cluster name on side B and it works, but our requirement says we cannot
>> change the cluster name, because during our product's major or patch
>> releases the scripts expect the cluster name to stay the same.
>>
>> On datacenter B, we are changing the seed nodes. On datacenter A, we are
>> changing the seed nodes in cassandra.yaml, but that will be picked up only
>> on a Cassandra restart, and we cannot have downtime for datacenter A. It
>> has to be up all the time.
>>
>>
>> Regards,
>> Kunal Vaid
>>
>>
>> On Mon, Apr 22, 2019 at 3:49 PM Marc Selwan 
>> wrote:
>>
>>> Hi Kunal,
>>>
>>> Did you edit the cassandra.yaml file in each data center to remove the
>>> seed nodes? On which ever data center is starting from scratch (I think
>>> it's B in your case), you may want to also change the cluster name.
>>>
>>> Best,
>>> *Marc Selwan | *DataStax *| *PM, Server Team *|* *(925) 413-7079* *|*
>>> Twitter <https://twitter.com/MarcSelwan>
>>>
>>> *  Quick links | *DataStax <http://www.datastax.com> *| *Training
>>> <http://www.academy.datastax.com> *| *Documentation
>>> <http://www.datastax.com/documentation/getting_started/doc/getting_started/gettingStartedIntro_r.html>
>>>  *| *Downloads <http://www.datastax.com/download>
>>>
>>>
>>>
>>> On Mon, Apr 22, 2019 at 3:38 PM Kunal  wrote:
>>>
>>>> Hi Friends,
>>>>
>>>> I need some help unpairing two datacenters.
>>>> We have 2 datacenters (say A and B) with 3 nodes in each datacenter.
>>>> We want to remove one whole datacenter (B, 3 nodes) from the other one
>>>> (A); basically, we want to unpair both datacenters and use them both
>>>> individually.
>>>> We are trying this using nodetool decommission, and it removes the 3
>>>> nodes from datacenter B. But when we bring datacenter B back up to use
>>>> it separately from datacenter A, it joins back into datacenter A. We
>>>> noticed in debug.log that nodes from datacenter A keep looking for nodes
>>>> in datacenter B and get connection refused errors while the nodes of
>>>> datacenter B are down, but as soon as the nodes come back, they join the
>>>> cluster.
>>>> We don't want nodes from datacenter B to join datacenter A once they
>>>> are decommissioned.
>>>>
>>>> Can you please let me know if I am missing anything.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>> Kunal Vaid
>>>>
>>>
>>
>> --
>>
>>
>>
>> Regards,
>> Kunal Vaid
>>
>

-- 



Regards,
Kunal Vaid


Re: Unpair cassandra datacenters

2019-04-22 Thread Kunal
HI Marc,

Appreciate your prompt response.

Yes, we are starting datacenter B from scratch. We tried changing the
cluster name on side B and it works, but our requirement says we cannot
change the cluster name, because during our product's major or patch releases
the scripts expect the cluster name to stay the same.

On datacenter B, we are changing the seed nodes. On datacenter A, we are
changing the seed nodes in cassandra.yaml, but that will be picked up only on
a Cassandra restart, and we cannot have downtime for datacenter A. It has to
be up all the time.


Regards,
Kunal Vaid


On Mon, Apr 22, 2019 at 3:49 PM Marc Selwan 
wrote:

> Hi Kunal,
>
> Did you edit the cassandra.yaml file in each data center to remove the
> seed nodes? On which ever data center is starting from scratch (I think
> it's B in your case), you may want to also change the cluster name.
>
> Best,
> *Marc Selwan | *DataStax *| *PM, Server Team *|* *(925) 413-7079* *|*
> Twitter <https://twitter.com/MarcSelwan>
>
> *  Quick links | *DataStax <http://www.datastax.com> *| *Training
> <http://www.academy.datastax.com> *| *Documentation
> <http://www.datastax.com/documentation/getting_started/doc/getting_started/gettingStartedIntro_r.html>
>  *| *Downloads <http://www.datastax.com/download>
>
>
>
> On Mon, Apr 22, 2019 at 3:38 PM Kunal  wrote:
>
>> Hi Friends,
>>
>> I need some help unpairing two datacenters.
>> We have 2 datacenters (say A and B) with 3 nodes in each datacenter. We
>> want to remove one whole datacenter (B, 3 nodes) from the other one (A);
>> basically, we want to unpair both datacenters and use them both
>> individually.
>> We are trying this using nodetool decommission, and it removes the 3
>> nodes from datacenter B. But when we bring datacenter B back up to use it
>> separately from datacenter A, it joins back into datacenter A. We noticed
>> in debug.log that nodes from datacenter A keep looking for nodes in
>> datacenter B and get connection refused errors while the nodes of
>> datacenter B are down, but as soon as the nodes come back, they join the
>> cluster.
>> We don't want nodes from datacenter B to join datacenter A once they are
>> decommissioned.
>>
>> Can you please let me know if I am missing anything.
>>
>> Thanks in advance.
>>
>> Regards,
>> Kunal Vaid
>>
>

-- 



Regards,
Kunal Vaid


Unpair cassandra datacenters

2019-04-22 Thread Kunal
Hi Friends,

I need some help unpairing two datacenters.
We have 2 datacenters (say A and B) with 3 nodes in each datacenter. We want
to remove one whole datacenter (B, 3 nodes) from the other one (A);
basically, we want to unpair both datacenters and use them both
individually.
We are trying this using nodetool decommission, and it removes the 3 nodes
from datacenter B. But when we bring datacenter B back up to use it
separately from datacenter A, it joins back into datacenter A. We noticed in
debug.log that nodes from datacenter A keep looking for nodes in datacenter B
and get connection refused errors while the nodes of datacenter B are down,
but as soon as the nodes come back, they join the cluster.
We don't want nodes from datacenter B to join datacenter A once they are
decommissioned.

Can you please let me know if I am missing anything.

Thanks in advance.

Regards,
Kunal Vaid


Re: time tracking for down node for nodetool repair

2019-04-09 Thread Kunal
Thanks everyone for your valuable suggestion.  Really appreciate it


Regards,
Kunal Vaid

On Mon, Apr 8, 2019 at 7:41 PM Nitan Kainth  wrote:

> Valid suggestion. Stick to the plan, avoid downtime of a node more than
> hinted handoff window. OR increase window to a larger value, if you know it
> is going to take longer than current setting
>
>
> Regards,
>
> Nitan
>
> Cell: 510 449 9629
>
> On Apr 8, 2019, at 8:43 PM, Soumya Jena 
> wrote:
>
> Cassandra tracks it, and no new hints will be created once the default 3
> hour window has passed. However, Cassandra will not automatically
> trigger a repair if your node is down for more than 3 hours. The default
> setting of 3 hours for hints is defined in cassandra.yaml; look for
> "max_hint_window_in_ms" in the cassandra.yaml file. It's configurable.
> Apart from periodic repairs, you should start a repair when you bring up
> a node which has missed some writes.
>
> One more thing: if a node is down for a long time and has missed a lot of
> writes, it may sometimes be better to add it as a fresh new node rather
> than adding it back and then running repair.
>
> On Mon, Apr 8, 2019 at 4:49 PM Stefan Miklosovic <
> stefan.mikloso...@instaclustr.com> wrote:
>
>> Ah, I see, it is the default for hinted handoffs. I was somehow thinking
>> it was a bigger figure, I do not know why :)
>>
>> I would say you should run repairs continuously / periodically so you
>> would not even have to do some thinking about that and it should run
>> in the background in a scheduled manner if possible.
>>
>> Regards
>>
>> On Tue, 9 Apr 2019 at 04:19, Kunal  wrote:
>> >
>> > Hello everyone..
>> >
>> >
>> >
>> > I have a 6-node Cassandra cluster, 3 nodes in each of two datacenters. If
>> one of the nodes goes down and remains down for more than 3 hours, I have to
>> run nodetool repair. I just wanted to ask whether Cassandra automatically
>> tracks the time when one of the nodes goes down, or do I need to write code
>> to track the time and run repair when the node comes back online after 3 hrs.
>> >
>> >
>> > Thanks in anticipation.
>> >
>> > Regards,
>> > Kunal Vaid
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>> --



Regards,
Kunal Vaid


time tracking for down node for nodetool repair

2019-04-08 Thread Kunal
Hello everyone..



I have a 6-node Cassandra cluster, 3 nodes in each of two datacenters. If one
of the nodes goes down and remains down for more than 3 hours, I have to run
nodetool repair. I just wanted to ask whether Cassandra automatically tracks
the time when one of the nodes goes down, or do I need to write code to track
the time and run repair when the node comes back online after 3 hrs.
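
The 3-hour figure is Cassandra's default max_hint_window_in_ms (10800000 ms). The decision can be sketched trivially, with an illustrative downtime value:

```shell
# Default hint window: hints are kept for a downed node for at most this long.
max_hint_window_in_ms=10800000   # 3 hours
node_down_seconds=14400          # illustrative: node was down for 4 hours

if [ $(( node_down_seconds * 1000 )) -gt "$max_hint_window_in_ms" ]; then
  echo "downtime exceeded hint window: run nodetool repair"
else
  echo "within hint window: hinted handoff catches the node up"
fi
```

With the values above this prints the repair branch, since 4 hours of downtime exceeds the 3-hour window.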

Thanks in anticipation.

Regards,
Kunal Vaid


Re: Two datacenters with one cassandra node in each datacenter

2019-02-07 Thread Kunal
Hi Dinesh,

We have a very small setup and the size of the data is also very small. The
max data size is around 2 GB. The latency expectation is around 10-15 ms.


Regards,
Kunal

On Wed, Feb 6, 2019 at 11:27 PM dinesh.jo...@yahoo.com.INVALID
 wrote:

> You also want to use Cassandra with a minimum of 3 nodes.
>
> Dinesh
>
>
> On Wednesday, February 6, 2019, 11:26:07 PM PST, dinesh.jo...@yahoo.com <
> dinesh.jo...@yahoo.com> wrote:
>
>
> Hey Kunal,
>
> Can you add more details about the size of data, read/write throughput,
> what are your latency expectations, etc? What do you mean by "performance"
> issue with replication? Without these details it's a bit tough to answer
> your questions.
>
> Dinesh
>
>
> On Wednesday, February 6, 2019, 3:47:05 PM PST, Kunal <
> kunal.v...@gmail.com> wrote:
>
>
> Hi All,
>
> I need some recommendations on using two datacenters with one node in each
> datacenter.
>
> In our organization, we are trying to have two Cassandra datacenters with
> only 1 node on each side. From the preliminary investigation, I see
> replication is happening, but I want to know whether we can use this
> deployment in production. Will there be any performance issue with
> replication?
>
> We have already set up 2 datacenters with one node in each datacenter, and
> replication is working fine.
>
> Can you please let me know if this kind of setup is recommended for
> production deployment.
> Thanks in anticipation.
>
> Regards,
> Kunal Vaid
>


-- 



Regards,
Kunal Vaid


Two datacenters with one cassandra node in each datacenter

2019-02-06 Thread Kunal
Hi All,

I need some recommendations on using two datacenters with one node in each
datacenter.

In our organization, we are trying to have two Cassandra datacenters with
only 1 node on each side. From the preliminary investigation, I see
replication is happening, but I want to know whether we can use this
deployment in production. Will there be any performance issue with
replication?

We have already set up 2 datacenters with one node in each datacenter, and
replication is working fine.

Can you please let me know if this kind of setup is recommended for
production deployment.
Thanks in anticipation.
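
If this layout is used anyway, cross-DC replication is driven by the keyspace definition; per-datacenter replica counts are set like this (a sketch; the keyspace and DC names are placeholders):

```shell
cqlsh -e "CREATE KEYSPACE demo_ks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': 1, 'DC2': 1};"
```

With one node per DC, every node holds a full copy of the data, and any single node failure makes that DC's data unavailable locally, which is part of why a minimum of 3 nodes per DC is usually recommended.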

Regards,
Kunal Vaid


RE: system.size_estimates - safe to remove sstables?

2018-03-12 Thread Kunal Gangakhedkar
No, this is a different cluster.

Kunal

On 13-Mar-2018 6:27 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
wrote:

Kunal,



Is  this the GCE cluster you are speaking of in the “Adding new DC?” thread?



Kenneth Brotman



*From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
*Sent:* Sunday, March 11, 2018 2:18 PM
*To:* user@cassandra.apache.org
*Subject:* Re: system.size_estimates - safe to remove sstables?



Finally, got a chance to work on it over the weekend.

It worked as advertised. :)



Thanks a lot, Chris.


Kunal



On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:

Thanks a lot, Chris.



Will try it today/tomorrow and update here.



Thanks,

Kunal



On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:

While its off you can delete the files in the directory yeah



Chris





On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:



Hi Chris,



I checked for snapshots and backups - none found.

Also, we're not using opscenter, hadoop or spark or any such tool.



So, do you think we can just remove the cf and restart the service?



Thanks,

Kunal



On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:

Any chance space used by snapshots? What files exist there that are taking
up space?

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:
>

> Hi all,
>
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it
shows up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh'
output.
>
> This is while the other node is chugging along - shows only 25MiB
consumed by size_estimates (du -sh output).
>
> Any idea why this discrepancy?
> Is it safe to remove the size_estimates sstables from the affected node
and restart the service?
>
> Thanks,
> Kunal
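
The clean-up Chris describes (with Cassandra stopped on the affected node only) might look like this (a sketch; the path assumes the default data directory layout, and size_estimates holds node-local data that is rebuilt after restart):

```shell
sudo service cassandra stop

# Remove only the size_estimates sstables (the directory name includes a
# table UUID, hence the glob).
rm -rf /var/lib/cassandra/data/system/size_estimates-*/*

sudo service cassandra start
```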

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
<user-unsubscribe@cassandra.apache.org>
For additional commands, e-mail: user-h...@cassandra.apache.org


Re: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Kunal Gangakhedkar
Yes, that's correct. The customer wants us to migrate the Cassandra setup
into their AWS account.

Thanks,
Kunal

On 13 March 2018 at 04:56, Kenneth Brotman <kenbrot...@yahoo.com.invalid>
wrote:

> I didn’t understand something.  Are you saying you are using one data
> center on Google and one on Amazon?
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Monday, March 12, 2018 4:24 PM
> *To:* user@cassandra.apache.org
> *Cc:* Nikhil Soman
> *Subject:* Re: [EXTERNAL] RE: Adding new DC?
>
>
>
>
>
> On 13 March 2018 at 03:28, Kenneth Brotman <kenbrot...@yahoo.com.invalid>
> wrote:
>
> You can’t migrate and upgrade at the same time perhaps but you could do
> one and then the other so as to end up on new version.  I’m guessing it’s
> an error in the yaml file or a port not open.  Is there any good reason for
> a production cluster to still be on version 2.1x?
>
>
>
> I'm not trying to migrate AND upgrade at the same time. However, the apt
> repo shows only 2.1.20 as the available version.
>
> This is the output from the new node in AWS
>
>
>
> ubuntu@ip-10-0-43-213:*~*$ apt-cache policy cassandra
> cassandra:
>  Installed: 2.1.20
>  Candidate: 2.1.20
>  Version table:
> *** 2.1.20 500
>500 http://www.apache.org/dist/cassandra/debian 21x/main amd64
> Packages
>100 /var/lib/dpkg/status
>
> Regarding open ports, I can cqlsh from the AWS node into the GCE node(s).
>
> As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE
> firewall for the public IP of the AWS instance.
>
>
>
> I mentioned earlier - there are some differences in the column types - for
> example, date (>= 2.2) vs. timestamp (2.1.x)
>
> The application has not been updated yet.
>
> Hence sticking to 2.1.x for now.
>
>
>
> And, so far, 2.1.x has been serving the purpose.
>
> Kunal
>
>
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Durity, Sean R [mailto:sean_r_dur...@homedepot.com]
> *Sent:* Monday, March 12, 2018 11:36 AM
> *To:* user@cassandra.apache.org
> *Subject:* RE: [EXTERNAL] RE: Adding new DC?
>
>
>
> You cannot migrate and upgrade at the same time across major versions.
> Streaming is (usually) not compatible between versions.
>
>
>
> As to the migration question, I would expect that you may need to put the
> external-facing ip addresses in several places in the cassandra.yaml file.
> And, yes, it would require a restart. Why is a non-restart more desirable?
> Most Cassandra changes require a restart, but you can do a rolling restart
> and not impact your application. This is fairly normal admin work and
> can/should be automated.
>
>
>
> How large is the cluster to migrate (# of nodes and size of data). The
> preferred method might depend on how much data needs to move. Is any
> application outage acceptable?
>
>
>
> Sean Durity
>
> lord of the (C*) rings (Staff Systems Engineer – Cassandra)
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com
> <kgangakhed...@gmail.com>]
> *Sent:* Sunday, March 11, 2018 10:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: Adding new DC?
>
>
>
> Hi Kenneth,
>
>
>
> Replies inline below.
>
>
>
> On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
> wrote:
>
> Hi Kunal,
>
>
>
> That version of Cassandra is too far before me so I’ll let others answer.
> I was wonder why you wouldn’t want to end up on 3.0x if you’re going
> through all the trouble of migrating anyway?
>
>
>
>
>
> Application side constraints - some data types are different between 2.1.x
> and 3.x (for example, date vs. timestamp).
>
>
>
> Besides, this is production setup - so, cannot take risk
>
> Are both data centers in the same region on AWS?  Can you provide yaml
> file for us to see?
>
>
>
>
>
> No, they are in different regions - GCE setup is in us-east while AWS
> setup is in Asia-south (Mumbai)
>
>
>
> Thanks,
>
> Kunal
>
> Kenneth Brotman
>
>
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 2:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Adding new DC?
>
>
>
> Hi all,
>
>
>
> We currently have a cluster in GCE for one of the customers.
>
> They want it to be migrated to AWS.
>
>
>
> I have setup one node in AWS to join into the cluster by following:
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/
> operations/ops_add_dc_to_cluster_t.html
> <https:

Re: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Kunal Gangakhedkar
On 13 March 2018 at 04:54, Kenneth Brotman <kenbrot...@yahoo.com.invalid>
wrote:

> Kunal,
>
>
>
> Please provide the following setting from the yaml files you  are using:
>
>
>
> seeds:
>

In GCE: seeds: "10.142.14.27"
In AWS (new node being added): seeds:
"35.196.96.247,35.227.127.245,35.196.241.232" (these are the public IP
addresses of 3 nodes from GCE)

 I have verified that I am able to do cqlsh from the AWS instance to all 3
ip addresses.


> listen_address:
>

We use the listen_interface setting instead of listen_address.

In GCE: listen_interface: eth0 (running ubuntu 14.04 LTS)
In AWS: listen_interface: ens3 (running ubuntu 16.04 LTS)


> broadcast_address:
>

I tried setting broadcast_address to one instance in GCE: broadcast_address:
35.196.96.247

In AWS: broadcast_address: 13.127.89.251 (this is the public/elastic IP of
the node in AWS)

rpc_address:
>

Like listen_address, we use rpc_interface.
In GCE: rpc_interface:  eth0
In AWS: rpc_interface:  ens3


> endpoint_snitch:
>

In both setups, we currently use GossipingPropertyFileSnitch.
The cassandra-rackdc.properties files from both setups:
GCE:
dc=DC1
rack=RAC1

AWS:
dc=DC2
rack=RAC1



> auto_bootstrap:
>

When the Google Cloud instances started up, we hadn't set this explicitly,
so they started off with the default value (auto_bootstrap: true).
However, as outlined in the DataStax doc for adding a new DC, I have added
'auto_bootstrap: false' to the Google Cloud instances (without restarting
the service, as per the doc).

In the AWS instance, I have added 'auto_bootstrap: false'; the doc says we
need to run "nodetool rebuild", and hence no automatic bootstrapping.
But I haven't gotten to that step yet.

Thanks,
Kunal
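
Pulling the settings above together, the yaml shape that usually lets nodes in two clouds gossip over public IPs is something like this (a sketch using the addresses quoted in this thread; verify against the 2.1 defaults file):

```yaml
# AWS node (cassandra.yaml fragment):
listen_interface: ens3              # bind to the private interface
broadcast_address: 13.127.89.251    # public/elastic IP the GCE DC should see
broadcast_rpc_address: 13.127.89.251
seeds: "35.196.96.247,35.227.127.245,35.196.241.232"  # public IPs of GCE seeds
```

Note that seeds actually live under seed_provider/parameters in the yaml; the shorthand above follows the thread's notation. With broadcast_address in play, every node (the GCE side too) must advertise an address the other DC can route to, which matches the "DN with private IP" symptom described in this thread.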


>
> Kenneth Brotman
>
>
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Monday, March 12, 2018 4:13 PM
> *To:* user@cassandra.apache.org
> *Cc:* Nikhil Soman
> *Subject:* Re: [EXTERNAL] RE: Adding new DC?
>
>
>
>
>
> On 13 March 2018 at 00:06, Durity, Sean R <sean_r_dur...@homedepot.com>
> wrote:
>
> You cannot migrate and upgrade at the same time across major versions.
> Streaming is (usually) not compatible between versions.
>
>
>
> I'm not trying to upgrade as of now - first priority is the migration.
>
> We can look at version upgrade later on.
>
>
>
>
>
> As to the migration question, I would expect that you may need to put the
> external-facing ip addresses in several places in the cassandra.yaml file.
> And, yes, it would require a restart. Why is a non-restart more desirable?
> Most Cassandra changes require a restart, but you can do a rolling restart
> and not impact your application. This is fairly normal admin work and
> can/should be automated.
>
>
>
> I just tried setting the broadcast_address in one of the instances in GCE
> to its public IP and restarted the service.
>
> However, it now shows all other nodes (in GCE) as DN in nodetool status
> output and the other nodes also report this node as DN with its
> internal/private IP address.
>
>
>
> I also tried setting the broadcast_rpc_address to the internal/private IP
> address - still the same.
>
>
>
>
>
> How large is the cluster to migrate (# of nodes and size of data). The
> preferred method might depend on how much data needs to move. Is any
> application outage acceptable?
>
>
>
> No. of nodes: 5
>
> RF: 3
>
> Data size (as reported by the load factor in nodetool status output):
> ~30GB per node
>
>
>
> Thanks,
> Kunal
>
>
>
>
>
> Sean Durity
>
> lord of the (C*) rings (Staff Systems Engineer – Cassandra)
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 10:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: Adding new DC?
>
>
>
> Hi Kenneth,
>
>
>
> Replies inline below.
>
>
>
> On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
> wrote:
>
> Hi Kunal,
>
>
>
> That version of Cassandra is too far before me so I’ll let others answer.
> I was wonder why you wouldn’t want to end up on 3.0x if you’re going
> through all the trouble of migrating anyway?
>
>
>
>
>
> Application side constraints - some data types are different between 2.1.x
> and 3.x (for example, date vs. timestamp).
>
>
>
> Besides, this is production setup - so, cannot take risk.
>
> Are both data centers in the same region on AWS?  Can you provide yaml
> file for us to see?
>
>
>
>
>
> No, they are in different regions - GCE setup is in us-east while AWS
> setup is in Asia-south (Mumbai)
>
>
>
&

Re: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Kunal Gangakhedkar
On 13 March 2018 at 03:28, Kenneth Brotman <kenbrot...@yahoo.com.invalid>
wrote:

> You can’t migrate and upgrade at the same time perhaps but you could do
> one and then the other so as to end up on new version.  I’m guessing it’s
> an error in the yaml file or a port not open.  Is there any good reason for
> a production cluster to still be on version 2.1x?
>

I'm not trying to migrate AND upgrade at the same time. However, the apt
repo shows only 2.1.20 as the available version.
This is the output from the new node in AWS

ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra
cassandra:
 Installed: 2.1.20
 Candidate: 2.1.20
 Version table:
*** 2.1.20 500
   500 http://www.apache.org/dist/cassandra/debian 21x/main amd64
Packages
   100 /var/lib/dpkg/status

Regarding open ports, I can cqlsh from the AWS node into the GCE node(s).
As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE
firewall for the public IP of the AWS instance.
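Firewall reachability of the inter-node and CQL ports can be sanity-checked from the new node independently of Cassandra. A minimal sketch (the target IP in the usage comment is a placeholder, not an address from this thread):

```python
import socket

def check_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Usage (placeholder IP): verify gossip/TLS/CQL ports from the AWS node.
# for port in (7000, 7001, 9042):
#     print(port, check_port("203.0.113.10", port))
```

Note that a successful cqlsh connection only proves 9042 is open; gossip on 7000/7001 has to be checked separately.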

As I mentioned earlier, there are some differences in the column types - for
example, date (>= 2.2) vs. timestamp (2.1.x)
The application has not been updated yet.
Hence sticking to 2.1.x for now.

And, so far, 2.1.x has been serving the purpose.

Kunal


>
> Kenneth Brotman
>
>
>
> *From:* Durity, Sean R [mailto:sean_r_dur...@homedepot.com]
> *Sent:* Monday, March 12, 2018 11:36 AM
> *To:* user@cassandra.apache.org
> *Subject:* RE: [EXTERNAL] RE: Adding new DC?
>
>
>
> You cannot migrate and upgrade at the same time across major versions.
> Streaming is (usually) not compatible between versions.
>
>
>
> As to the migration question, I would expect that you may need to put the
> external-facing ip addresses in several places in the cassandra.yaml file.
> And, yes, it would require a restart. Why is a non-restart more desirable?
> Most Cassandra changes require a restart, but you can do a rolling restart
> and not impact your application. This is fairly normal admin work and
> can/should be automated.
>
>
>
> How large is the cluster to migrate (# of nodes and size of data). The
> preferred method might depend on how much data needs to move. Is any
> application outage acceptable?
>
>
>
> Sean Durity
>
> lord of the (C*) rings (Staff Systems Engineer – Cassandra)
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com
> <kgangakhed...@gmail.com>]
> *Sent:* Sunday, March 11, 2018 10:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: Adding new DC?
>
>
>
> Hi Kenneth,
>
>
>
> Replies inline below.
>
>
>
> On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
> wrote:
>
> Hi Kunal,
>
>
>
> That version of Cassandra is too far before me so I’ll let others answer.
> I was wondering why you wouldn’t want to end up on 3.0x if you’re going
> through all the trouble of migrating anyway?
>
>
>
>
>
> Application side constraints - some data types are different between 2.1.x
> and 3.x (for example, date vs. timestamp).
>
>
>
> Besides, this is production setup - so, cannot take risk.
>
> Are both data centers in the same region on AWS?  Can you provide yaml
> file for us to see?
>
>
>
>
>
> No, they are in different regions - GCE setup is in us-east while AWS
> setup is in Asia-south (Mumbai)
>
>
>
> Thanks,
>
> Kunal
>
> Kenneth Brotman
>
>
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 2:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Adding new DC?
>
>
>
> Hi all,
>
>
>
> We currently have a cluster in GCE for one of the customers.
>
> They want it to be migrated to AWS.
>
>
>
> I have setup one node in AWS to join into the cluster by following:
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/
> operations/ops_add_dc_to_cluster_t.html
>
>
>
> Will add more nodes once the first one joins successfully.
>
>
>
> The node in AWS has an elastic IP - which is white-listed for ports
> 7000-7001, 7199, 9042 in GCE firewall.
>
>
>
> The snitch is set to GossipingPropertyFileSnitch. The GCE setup has
> dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2.
>
>
>
> When I start cassandra service on the AWS instance, I see the version
> handshake msgs in the logs trying to connect to the public IPs of the GCE nodes.

Re: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Kunal Gangakhedkar
On 13 March 2018 at 00:06, Durity, Sean R <sean_r_dur...@homedepot.com>
wrote:

> You cannot migrate and upgrade at the same time across major versions.
> Streaming is (usually) not compatible between versions.
>

I'm not trying to upgrade as of now - first priority is the migration.
We can look at version upgrade later on.


>
>
> As to the migration question, I would expect that you may need to put the
> external-facing ip addresses in several places in the cassandra.yaml file.
> And, yes, it would require a restart. Why is a non-restart more desirable?
> Most Cassandra changes require a restart, but you can do a rolling restart
> and not impact your application. This is fairly normal admin work and
> can/should be automated.
>

I just tried setting the broadcast_address in one of the instances in GCE
to its public IP and restarted the service.
However, it now shows all other nodes (in GCE) as DN in nodetool status
output and the other nodes also report this node as DN with its
internal/private IP address.

I also tried setting the broadcast_rpc_address to the internal/private IP
address - still the same.


>
>
> How large is the cluster to migrate (# of nodes and size of data). The
> preferred method might depend on how much data needs to move. Is any
> application outage acceptable?
>

No. of nodes: 5
RF: 3
Data size (as reported by the load factor in nodetool status output): ~30GB
per node

Thanks,
Kunal


>
>
> Sean Durity
>
> lord of the (C*) rings (Staff Systems Engineer – Cassandra)
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 10:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: Adding new DC?
>
>
>
> Hi Kenneth,
>
>
>
> Replies inline below.
>
>
>
> On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
> wrote:
>
> Hi Kunal,
>
>
>
> That version of Cassandra is too far before me so I’ll let others answer.
> I was wondering why you wouldn’t want to end up on 3.0x if you’re going
> through all the trouble of migrating anyway?
>
>
>
>
>
> Application side constraints - some data types are different between 2.1.x
> and 3.x (for example, date vs. timestamp).
>
>
>
> Besides, this is production setup - so, cannot take risk.
>
> Are both data centers in the same region on AWS?  Can you provide yaml
> file for us to see?
>
>
>
>
>
> No, they are in different regions - GCE setup is in us-east while AWS
> setup is in Asia-south (Mumbai)
>
>
>
> Thanks,
>
> Kunal
>
> Kenneth Brotman
>
>
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 2:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Adding new DC?
>
>
>
> Hi all,
>
>
>
> We currently have a cluster in GCE for one of the customers.
>
> They want it to be migrated to AWS.
>
>
>
> I have setup one node in AWS to join into the cluster by following:
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/
> operations/ops_add_dc_to_cluster_t.html
>
>
>
> Will add more nodes once the first one joins successfully.
>
>
>
> The node in AWS has an elastic IP - which is white-listed for ports
> 7000-7001, 7199, 9042 in GCE firewall.
>
>
>
> The snitch is set to GossipingPropertyFileSnitch. The GCE setup has
> dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2.
>
>
>
> When I start cassandra service on the AWS instance, I see the version
> handshake msgs in the logs trying to connect to the public IPs of the GCE
> nodes:
>
> OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx
>
> However, nodetool status output on both sides don't show the other side at
> all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS
> setup doesn't show old DC (dc=DC1).
>
>
>
> In cassandra.yaml file, I'm only using listen_interface and rpc_interface
> settings - no explicit IP addresses used - so they end up using the internal
> private IP ranges.
>
>
>
> Do I need to explicitly add the broadcast_address? for both side?
>
> Would that require restarting of cassandra service on GCE side? Or is it
> possible to change that setting on-the-fly without a restart?
>
>
>
> I would prefer a non-restart option.
>
>
>
> PS: The cassandra version running in GCE is 2.1.18 while the new node setup
> in AWS is running 2.1.20 - just in case if that's relevant.

RE: Adding new DC?

2018-03-11 Thread Kunal Gangakhedkar
Hi Kenneth,

Replies inline below.

On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
wrote:

Hi Kunal,



That version of Cassandra is too far before me so I’ll let others answer.
I was wondering why you wouldn’t want to end up on 3.0x if you’re going
through all the trouble of migrating anyway?




Application side constraints - some data types are different between 2.1.x
and 3.x (for example, date vs. timestamp).

Besides, this is production setup - so, cannot take risk.

Are both data centers in the same region on AWS?  Can you provide yaml file
for us to see?




No, they are in different regions - GCE setup is in us-east while AWS setup
is in Asia-south (Mumbai)

Thanks,
Kunal

Kenneth Brotman



*From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
*Sent:* Sunday, March 11, 2018 2:32 PM
*To:* user@cassandra.apache.org
*Subject:* Adding new DC?



Hi all,



We currently have a cluster in GCE for one of the customers.

They want it to be migrated to AWS.



I have setup one node in AWS to join into the cluster by following:

https://docs.datastax.com/en/cassandra/2.1/cassandra/
operations/ops_add_dc_to_cluster_t.html



Will add more nodes once the first one joins successfully.



The node in AWS has an elastic IP - which is white-listed for ports
7000-7001, 7199, 9042 in GCE firewall.



The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1,
rack=RAC1 while on AWS, I changed the DC to dc=DC2.



When I start cassandra service on the AWS instance, I see the version
handshake msgs in the logs trying to connect to the public IPs of the GCE
nodes:

OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx

However, nodetool status output on both sides don't show the other side at
all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS
setup doesn't show old DC (dc=DC1).



In cassandra.yaml file, I'm only using listen_interface and rpc_interface
settings - no explicit IP addresses used - so they end up using the internal
private IP ranges.



Do I need to explicitly add the broadcast_address? for both side?

Would that require restarting of cassandra service on GCE side? Or is it
possible to change that setting on-the-fly without a restart?



I would prefer a non-restart option.



PS: The cassandra version running in GCE is 2.1.18 while the new node setup
in AWS is running 2.1.20 - just in case if that's relevant



Thanks,

Kunal


Adding new DC?

2018-03-11 Thread Kunal Gangakhedkar
Hi all,

We currently have a cluster in GCE for one of the customers.
They want it to be migrated to AWS.

I have setup one node in AWS to join into the cluster by following:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html

Will add more nodes once the first one joins successfully.

The node in AWS has an elastic IP - which is white-listed for ports
7000-7001, 7199, 9042 in GCE firewall.

The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1,
rack=RAC1 while on AWS, I changed the DC to dc=DC2.

When I start cassandra service on the AWS instance, I see the version
handshake msgs in the logs trying to connect to the public IPs of the GCE
nodes:
OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx

However, nodetool status output on both sides don't show the other side at
all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS
setup doesn't show old DC (dc=DC1).

In cassandra.yaml file, I'm only using listen_interface and rpc_interface
settings - no explicit IP addresses used - so they end up using the internal
private IP ranges.

Do I need to explicitly add the broadcast_address? for both side?
Would that require restarting of cassandra service on GCE side? Or is it
possible to change that setting on-the-fly without a restart?

I would prefer a non-restart option.
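For reference, cross-DC visibility over public IPs is normally handled with the broadcast settings, and changing them does require a (rolling) restart - there is no on-the-fly option for listen/broadcast addresses. A sketch of the relevant cassandra.yaml fragment (the IP addresses are illustrative placeholders, not values from this thread):

```yaml
# cassandra.yaml fragment - addresses below are placeholders
listen_address: 10.240.0.5         # private IP the node binds to
broadcast_address: 35.190.0.5      # public IP advertised to nodes in the other DC
rpc_address: 10.240.0.5
broadcast_rpc_address: 35.190.0.5  # public IP returned to client drivers
```

One caveat: once a node advertises a public broadcast_address, its peers gossip to that public address, so the firewall must also allow traffic between nodes in the same DC via their public IPs on port 7000/7001 - which may explain nodes showing as DN after such a change.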

PS: The cassandra version running in GCE is 2.1.18 while the new node setup
in AWS is running 2.1.20 - just in case if that's relevant.

Thanks,
Kunal


Re: system.size_estimates - safe to remove sstables?

2018-03-11 Thread Kunal Gangakhedkar
Finally, got a chance to work on it over the weekend.
It worked as advertised. :)

Thanks a lot, Chris.

Kunal

On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:

> Thanks a lot, Chris.
>
> Will try it today/tomorrow and update here.
>
> Thanks,
> Kunal
>
> On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:
>
>> While its off you can delete the files in the directory yeah
>>
>> Chris
>>
>>
>> On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
>> wrote:
>>
>> Hi Chris,
>>
>> I checked for snapshots and backups - none found.
>> Also, we're not using opscenter, hadoop or spark or any such tool.
>>
>> So, do you think we can just remove the cf and restart the service?
>>
>> Thanks,
>> Kunal
>>
>> On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:
>>
>>> Any chance space used by snapshots? What files exist there that are
>>> taking up space?
>>>
>>> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <
>>> kgangakhed...@gmail.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I have a 2-node cluster running cassandra 2.1.18.
>>> > One of the nodes has run out of disk space and died - almost all of it
>>> shows up as occupied by size_estimates CF.
>>> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du
>>> -sh' output.
>>> >
>>> > This is while the other node is chugging along - shows only 25MiB
>>> consumed by size_estimates (du -sh output).
>>> >
> >>> > Any idea why this discrepancy?
>>> > Is it safe to remove the size_estimates sstables from the affected
>>> node and restart the service?
>>> >
>>> > Thanks,
>>> > Kunal
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>
>>
>


Re: system.size_estimates - safe to remove sstables?

2018-03-07 Thread Kunal Gangakhedkar
Thanks a lot, Chris.

Will try it today/tomorrow and update here.

Thanks,
Kunal

On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:

> While its off you can delete the files in the directory yeah
>
> Chris
>
>
> On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
> wrote:
>
> Hi Chris,
>
> I checked for snapshots and backups - none found.
> Also, we're not using opscenter, hadoop or spark or any such tool.
>
> So, do you think we can just remove the cf and restart the service?
>
> Thanks,
> Kunal
>
> On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:
>
>> Any chance space used by snapshots? What files exist there that are
>> taking up space?
>>
>> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
>> wrote:
>> >
>> > Hi all,
>> >
>> > I have a 2-node cluster running cassandra 2.1.18.
>> > One of the nodes has run out of disk space and died - almost all of it
>> shows up as occupied by size_estimates CF.
>> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du
>> -sh' output.
>> >
>> > This is while the other node is chugging along - shows only 25MiB
>> consumed by size_estimates (du -sh output).
>> >
> >> > Any idea why this discrepancy?
>> > Is it safe to remove the size_estimates sstables from the affected node
>> and restart the service?
>> >
>> > Thanks,
>> > Kunal
>>
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>
>


Re: system.size_estimates - safe to remove sstables?

2018-03-06 Thread Kunal Gangakhedkar
Hi Chris,

I checked for snapshots and backups - none found.
Also, we're not using opscenter, hadoop or spark or any such tool.

So, do you think we can just remove the cf and restart the service?

Thanks,
Kunal

On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:

> Any chance space used by snapshots? What files exist there that are taking
> up space?
>
> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
> wrote:
> >
> > Hi all,
> >
> > I have a 2-node cluster running cassandra 2.1.18.
> > One of the nodes has run out of disk space and died - almost all of it
> shows up as occupied by size_estimates CF.
> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh'
> output.
> >
> > This is while the other node is chugging along - shows only 25MiB
> consumed by size_estimates (du -sh output).
> >
> > Any idea why this discrepancy?
> > Is it safe to remove the size_estimates sstables from the affected node
> and restart the service?
> >
> > Thanks,
> > Kunal
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


system.size_estimates - safe to remove sstables?

2018-03-04 Thread Kunal Gangakhedkar
Hi all,

I have a 2-node cluster running cassandra 2.1.18.
One of the nodes has run out of disk space and died - almost all of it
shows up as occupied by size_estimates CF.
Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh'
output.

This is while the other node is chugging along - shows only 25MiB consumed
by size_estimates (du -sh output).

Any idea why this discrepancy?
Is it safe to remove the size_estimates sstables from the affected node and
restart the service?

Thanks,
Kunal


Re: TRUNCATE on a disk almost full - possible?

2017-04-22 Thread Kunal Gangakhedkar
Great, thanks a lot for the help, guys.

I just did the truncation + clearsnapshot - worked smoothly :)
Freed up 400GB, yay \o/

Really appreciate your help.

Thanks once again.

Kunal

On 21 April 2017 at 15:04, Nicolas Guyomar <nicolas.guyo...@gmail.com>
wrote:

> Hi Kunal,
>
> Timeouts usually occur in the client (e.g. cqlsh); a timeout does not mean that
> the truncate operation is interrupted.
>
> Have you checked that you have no old snapshot (automatic snapshot for
> instance) that you could get rid off to get some space back ?
>
> On 21 April 2017 at 11:27, benjamin roth <brs...@gmail.com> wrote:
>
>> Truncate needs no space. It just creates a hard link of all affected
>> SSTables under the corresponding -SNAPSHOT dir (at least with default
>> settings) and then removes the SSTables.
>> Also this operation should be rather fast as it is mostly a file-deletion
>> process with some metadata updates.
>>
>> 2017-04-21 11:21 GMT+02:00 Kunal Gangakhedkar <kgangakhed...@gmail.com>:
>>
>>> Hi all,
>>>
>>> We have a CF that's grown too large - it's not getting actively used in
>>> the app right now.
>>> The on-disk size of the . directory is ~407GB and I have only
>>> ~40GB free left on the disk.
>>>
>>> I understand that if I trigger a TRUNCATE on this CF, cassandra will try
>>> to take snapshot.
>>> My question:
>>> Is the ~40GB enough to safely truncate this table?
>>>
>>> I will manually remove the . directory once the truncate is
>>> completed.
>>>
>>> Also, while browsing through earlier msgs regarding truncate, I noticed
>>> that it's possible to get OperationTimedOut
>>> <http://www.mail-archive.com/user@cassandra.apache.org/msg48958.html>
>>> exception. Does that stop the truncate operation?
>>>
>>> Is there any other safe way to clean up the CF?
>>>
>>> Thanks,
>>> Kunal
>>>
>>
>>
>
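The hard-link behavior described above can be demonstrated directly: a snapshot entry shares the SSTable's inode, so it consumes no additional data blocks. A sketch using a stand-in file (not real SSTable handling):

```python
import os
import tempfile

def snapshot_link_demo() -> tuple:
    """Mimic what a snapshot does: hard-link an 'SSTable' into a snapshot dir."""
    with tempfile.TemporaryDirectory() as d:
        sstable = os.path.join(d, "mytable-Data.db")
        with open(sstable, "wb") as f:
            f.write(b"x" * 1_000_000)  # stand-in for SSTable contents

        snapdir = os.path.join(d, "snapshots", "pre-truncate")
        os.makedirs(snapdir)
        link_path = os.path.join(snapdir, "mytable-Data.db")
        os.link(sstable, link_path)  # hard link, as snapshots do

        st, link_st = os.stat(sstable), os.stat(link_path)
        # Same inode => same data blocks on disk; nlink counts both entries.
        return st.st_ino == link_st.st_ino, st.st_nlink

print(snapshot_link_demo())  # (True, 2) on a POSIX filesystem
```

This is why the truncate-time snapshot needs essentially no free space; the space is only reclaimed once the last link (the snapshot) is removed, e.g. via clearsnapshot.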


TRUNCATE on a disk almost full - possible?

2017-04-21 Thread Kunal Gangakhedkar
Hi all,

We have a CF that's grown too large - it's not getting actively used in the
app right now.
The on-disk size of the . directory is ~407GB and I have only ~40GB
free left on the disk.

I understand that if I trigger a TRUNCATE on this CF, cassandra will try to
take snapshot.
My question:
Is the ~40GB enough to safely truncate this table?

I will manually remove the . directory once the truncate is
completed.

Also, while browsing through earlier msgs regarding truncate, I noticed
that it's possible to get OperationTimedOut
<http://www.mail-archive.com/user@cassandra.apache.org/msg48958.html>
exception. Does that stop the truncate operation?

Is there any other safe way to clean up the CF?

Thanks,
Kunal


Re: Backups eating up disk space

2017-02-27 Thread Kunal Gangakhedkar
Hi all,

Is it safe to delete the backup folders from various CFs from 'system'
keyspace too?
I seem to have missed them in the last cleanup - and now, the
size_estimates and compactions_in_progress seem to have grown large (>200G
and ~6G respectively).

Can I remove them too?

Thanks,
Kunal

On 13 January 2017 at 18:30, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:

> Great, thanks a lot to all for the help :)
>
> I finally took the dive and went with Razi's suggestions.
> In summary, this is what I did:
>
>- turn off incremental backups on each of the nodes in rolling fashion
>- remove the 'backups' directory from each keyspace on each node.
>
> This ended up freeing up almost 350GB on each node - yay :)
>
> Again, thanks a lot for the help, guys.
>
> Kunal
>
> On 12 January 2017 at 21:15, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
> raziuddin.kh...@nih.gov> wrote:
>
>> snapshots are slightly different than backups.
>>
>>
>>
>> In my explanation of the hardlinks created in the backups folder, notice
>> that compacted sstables, never end up in the backups folder.
>>
>>
>>
>> On the other hand, a snapshot is meant to represent the data at a
>> particular moment in time. Thus, the snapshots directory contains hardlinks
>> to all active sstables at the time the snapshot was taken, which would
>> include: compacted sstables; and any sstables from memtable flush or
>> streamed from other nodes that both exist in the table directory and the
>> backups directory.
>>
>>
>>
>> So, that would be the difference between snapshots and backups.
>>
>>
>>
>> Best regards,
>>
>> -Razi
>>
>>
>>
>>
>>
>> *From: *Alain RODRIGUEZ <arodr...@gmail.com>
>> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Date: *Thursday, January 12, 2017 at 9:16 AM
>>
>> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Subject: *Re: Backups eating up disk space
>>
>>
>>
>> My 2 cents,
>>
>>
>>
>> As I mentioned earlier, we're not currently using snapshots - it's only
>> the backups that are bothering me right now.
>>
>>
>>
>> I believe backups folder is just the new name for the previously called
>> snapshots folder. But I can be completely wrong, I haven't played that much
>> with snapshots in new versions yet.
>>
>>
>>
>> Anyway, some operations in Apache Cassandra can trigger a snapshot:
>>
>>
>>
>> - Repair (when not using parallel option but sequential repairs instead)
>>
>> - Truncating a table (by default)
>>
>> - Dropping a table (by default)
>>
>> - Maybe other I can't think of... ?
>>
>>
>>
>> If you want to clean space but still keep a backup you can run:
>>
>>
>>
>> "nodetool clearsnapshot"
>>
>> "nodetool snapshot "
>>
>>
>>
>> This way and for a while, data won't be taking space as old files will be
>> cleaned and new files will be only hardlinks as detailed above. Then you
>> might want to work at a proper backup policy, probably implying getting
>> data out of production server (a lot of people uses S3 or similar
>> services). Or just do that from time to time, meaning you only keep a
>> backup and disk space behaviour will be hard to predict.
>>
>>
>>
>> C*heers,
>>
>> ---
>>
>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>>
>> France
>>
>>
>>
>> The Last Pickle - Apache Cassandra Consulting
>>
>> http://www.thelastpickle.com
>>
>>
>>
>> 2017-01-12 6:42 GMT+01:00 Prasenjit Sarkar <prasenjit.sar...@datos.io>:
>>
>> Hi Kunal,
>>
>>
>>
>> Razi's post does give a very lucid description of how cassandra manages
>> the hard links inside the backup directory.
>>
>>
>>
>> Where it needs clarification is the following:
>>
>> --> incremental backups is a system wide setting and so its an all or
>> nothing approach
>>
>>
>>
>> --> as multiple people have stated, incremental backups do not create
>> hard links to compacted sstables. however, this can bloat the size of your
>> backups
>>
>>
>>
>> --> again as stated, it is a general industry practice to place backups
>> in a different secondary storage location than the main production site. So
> best to move it to the secondary storage before applying rm on the backups folder

Re: Backups eating up disk space

2017-01-13 Thread Kunal Gangakhedkar
Great, thanks a lot to all for the help :)

I finally took the dive and went with Razi's suggestions.
In summary, this is what I did:

   - turn off incremental backups on each of the nodes in rolling fashion
   - remove the 'backups' directory from each keyspace on each node.

This ended up freeing up almost 350GB on each node - yay :)

Again, thanks a lot for the help, guys.

Kunal

On 12 January 2017 at 21:15, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
raziuddin.kh...@nih.gov> wrote:

> snapshots are slightly different than backups.
>
>
>
> In my explanation of the hardlinks created in the backups folder, notice
> that compacted sstables, never end up in the backups folder.
>
>
>
> On the other hand, a snapshot is meant to represent the data at a
> particular moment in time. Thus, the snapshots directory contains hardlinks
> to all active sstables at the time the snapshot was taken, which would
> include: compacted sstables; and any sstables from memtable flush or
> streamed from other nodes that both exist in the table directory and the
> backups directory.
>
>
>
> So, that would be the difference between snapshots and backups.
>
>
>
> Best regards,
>
> -Razi
>
>
>
>
>
> *From: *Alain RODRIGUEZ <arodr...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Thursday, January 12, 2017 at 9:16 AM
>
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: Backups eating up disk space
>
>
>
> My 2 cents,
>
>
>
> As I mentioned earlier, we're not currently using snapshots - it's only
> the backups that are bothering me right now.
>
>
>
> I believe backups folder is just the new name for the previously called
> snapshots folder. But I can be completely wrong, I haven't played that much
> with snapshots in new versions yet.
>
>
>
> Anyway, some operations in Apache Cassandra can trigger a snapshot:
>
>
>
> - Repair (when not using parallel option but sequential repairs instead)
>
> - Truncating a table (by default)
>
> - Dropping a table (by default)
>
> - Maybe other I can't think of... ?
>
>
>
> If you want to clean space but still keep a backup you can run:
>
>
>
> "nodetool clearsnapshot"
>
> "nodetool snapshot "
>
>
>
> This way and for a while, data won't be taking space as old files will be
> cleaned and new files will be only hardlinks as detailed above. Then you
> might want to work at a proper backup policy, probably implying getting
> data out of production server (a lot of people uses S3 or similar
> services). Or just do that from time to time, meaning you only keep a
> backup and disk space behaviour will be hard to predict.
>
>
>
> C*heers,
>
> ---
>
> Alain Rodriguez - @arodream - al...@thelastpickle.com
>
> France
>
>
>
> The Last Pickle - Apache Cassandra Consulting
>
> http://www.thelastpickle.com
>
>
>
> 2017-01-12 6:42 GMT+01:00 Prasenjit Sarkar <prasenjit.sar...@datos.io>:
>
> Hi Kunal,
>
>
>
> Razi's post does give a very lucid description of how cassandra manages
> the hard links inside the backup directory.
>
>
>
> Where it needs clarification is the following:
>
> --> incremental backups is a system wide setting and so its an all or
> nothing approach
>
>
>
> --> as multiple people have stated, incremental backups do not create hard
> links to compacted sstables. however, this can bloat the size of your
> backups
>
>
>
> --> again as stated, it is a general industry practice to place backups in
> a different secondary storage location than the main production site. So
> best to move it to the secondary storage before applying rm on the backups
> folder
>
>
>
> In my experience with production clusters, managing the backups folder
> across multiple nodes can be painful if the objective is to ever recover
> data. With the usual disclaimers, better to rely on third party vendors to
> accomplish the needful rather than scripts/tablesnap.
>
>
>
> Regards
>
> Prasenjit
>
>
>
>
>
> On Wed, Jan 11, 2017 at 7:49 AM, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
> raziuddin.kh...@nih.gov> wrote:
>
> Hello Kunal,
>
>
>
> Caveat: I am not a super-expert on Cassandra, but it helps to explain to
> others, in order to eventually become an expert, so if my explanation is
> wrong, I would hope others would correct me. :)
>
>
>
> The active sstables/data files are are all the files located in the
> directory for the table.
>
> You can safely remove all fil

Re: Backups eating up disk space

2017-01-11 Thread Kunal Gangakhedkar
Thanks for the reply, Razi.

As I mentioned earlier, we're not currently using snapshots - it's only the
backups that are bothering me right now.

So my next question is pertaining to this statement of yours:

As far as I am aware, using *rm* is perfectly safe to delete the
> directories for snapshots/backups as long as you are careful not to delete
> your actively used sstable files and directories.


How do I find out which are the actively used sstables?
If by that you mean the main data files, does that mean I can safely remove
all files ONLY under the "backups/" directory?
Or, removing any files that are current hard-links inside backups can
potentially cause any issues?

Thanks,
Kunal
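To make "only the backups directory" concrete: the actively used SSTables live directly in each table directory, while incremental backups sit in a backups/ subdirectory next to them. A cautious enumeration sketch (the default /var/lib/cassandra/data layout is an assumption; review the list before any rm):

```python
import os

def find_backup_dirs(data_dir: str) -> list:
    """Return <keyspace>/<table>/backups directories under data_dir.

    Only these directories are candidates for wholesale removal; the SSTable
    files directly inside each table directory are the live data.
    """
    found = []
    for keyspace in sorted(os.listdir(data_dir)):
        ks_path = os.path.join(data_dir, keyspace)
        if not os.path.isdir(ks_path):
            continue
        for table in sorted(os.listdir(ks_path)):
            backups = os.path.join(ks_path, table, "backups")
            if os.path.isdir(backups):
                found.append(backups)
    return found

# Usage (default data directory is an assumption):
# for d in find_backup_dirs("/var/lib/cassandra/data"):
#     print(d)
```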

On 11 January 2017 at 01:06, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
raziuddin.kh...@nih.gov> wrote:

> Hello Kunal,
>
>
>
> I would take a look at the following configuration options in the
> Cassandra.yaml
>
>
>
> *Common automatic backup settings*
>
> *Incremental_backups:*
>
> http://docs.datastax.com/en/archived/cassandra/3.x/
> cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__
> incremental_backups
>
>
>
> (Default: false) Backs up data updated since the last snapshot was taken.
> When enabled, Cassandra creates a hard link to each SSTable flushed or
> streamed locally in a backups subdirectory of the keyspace data. Removing
> these links is the operator's responsibility.
>
>
>
> *snapshot_before_compaction*:
>
> http://docs.datastax.com/en/archived/cassandra/3.x/
> cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__
> snapshot_before_compaction
>
>
>
> (Default: false) Enables or disables taking a snapshot before each
> compaction. A snapshot is useful to back up data when there is a data
> format change. Be careful using this option: Cassandra does not clean up
> older snapshots automatically.
>
>
>
>
>
> *Advanced automatic backup setting*
>
> *auto_snapshot*:
>
> http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__auto_snapshot
>
>
>
> (Default: true) Enables or disables whether Cassandra takes a snapshot of
> the data before truncating a keyspace or dropping a table. To prevent data
> loss, Datastax strongly advises using the default setting. If you
> set auto_snapshot to false, you lose data on truncation or drop.
>
>
>
>
>
> *nodetool* also provides methods to manage snapshots.
> http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsNodetool.html
>
> See the specific commands:
>
>- nodetool clearsnapshot
>
> <http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsClearSnapShot.html>
>Removes one or more snapshots.
>- nodetool listsnapshots
>
> <http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsListSnapShots.html>
>Lists snapshot names, size on disk, and true size.
>- nodetool snapshot
>
> <http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsSnapShot.html>
>Take a snapshot of one or more keyspaces, or of a table, to backup
>data.
>
>
>
> As far as I am aware, using *rm* is perfectly safe to delete the
> directories for snapshots/backups as long as you are careful not to delete
> your actively used sstable files and directories.  I think the *nodetool
> clearsnapshot* command is provided so that you don’t accidentally delete
> actively used files.  Last I used *clearsnapshot*, (a very long time
> ago), I thought it left behind the directory, but this could have been
> fixed in newer versions (so you might want to check that).
>
>
>
> HTH
>
> -Razi
>
>
>
>
>
> *From: *Jonathan Haddad <j...@jonhaddad.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Tuesday, January 10, 2017 at 12:26 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: Backups eating up disk space
>
>
>
> If you remove the files from the backup directory, you would not have data
> loss in the case of a node going down.  They're hard links to the same
> files that are in your data directory, and are created when an sstable is
> written to disk.  At the time, they take up (almost) no space, so they
> aren't a big deal, but when the sstable gets compacted, they stick around,
> so they end up not freeing space up.
>
>
>
> Usually you use incremental backups as a means of moving the sstables off
> the node to a backup location.  If you're not doing anything with them,
> they're just wasting space and you should delete them.
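The hard-link behaviour described above can be demonstrated with a minimal sketch (plain Python, not Cassandra code): a backup link shares the original sstable's inode and data blocks, so it costs almost nothing -- until compaction removes the original and the link alone keeps the blocks alive.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    data = os.path.join(d, "table-Data.db")
    backup = os.path.join(d, "backup-Data.db")
    with open(data, "wb") as f:
        f.write(b"x" * 4096)
    os.link(data, backup)          # roughly what incremental_backups does
    assert os.stat(data).st_ino == os.stat(backup).st_ino  # same inode
    assert os.stat(data).st_nlink == 2                     # two names, one file
    os.remove(data)                # stand-in for compaction deleting the original
    assert os.path.exists(backup)  # the backup link still holds the data
    print("backup link survives original's deletion")
```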

Re: Backups eating up disk space

2017-01-10 Thread Kunal Gangakhedkar
Thanks for quick reply, Jon.

But, what about in case of node/cluster going down? Would there be data
loss if I remove these files manually?

How is it typically managed in production setups?
What are the best-practices for the same?
Do people take snapshots on each node before removing the backups?

This is my first production deployment - so, still trying to learn.

Thanks,
Kunal

On 10 January 2017 at 21:36, Jonathan Haddad <j...@jonhaddad.com> wrote:

> You can just delete them off the filesystem (rm)
>
> On Tue, Jan 10, 2017 at 8:02 AM Kunal Gangakhedkar <
> kgangakhed...@gmail.com> wrote:
>
>> Hi all,
>>
>> We have a 3-node cassandra cluster with incremental backup set to true.
>> Each node has 1TB data volume that stores cassandra data.
>>
>> The load in the output of 'nodetool status' comes up at around 260GB each
>> node.
>> All our keyspaces use replication factor = 3.
>>
>> However, the df output shows the data volumes consuming around 850GB of
>> space.
>> I checked the keyspace directory structures - most of the space goes in
>> /data///backups.
>>
>> We have never manually run snapshots.
>>
>> What is the typical procedure to clear the backups?
>> Can it be done without taking the node offline?
>>
>> Thanks,
>> Kunal
>>
>


Backups eating up disk space

2017-01-10 Thread Kunal Gangakhedkar
Hi all,

We have a 3-node cassandra cluster with incremental backup set to true.
Each node has 1TB data volume that stores cassandra data.

The load in the output of 'nodetool status' comes up at around 260GB each
node.
All our keyspaces use replication factor = 3.

However, the df output shows the data volumes consuming around 850GB of
space.
I checked the keyspace directory structures - most of the space goes in
/data///backups.

We have never manually run snapshots.

What is the typical procedure to clear the backups?
Can it be done without taking the node offline?

Thanks,
Kunal


Unsubscribe

2016-10-28 Thread Kunal Gaikwad
Unsubscribe

Regards,
Kunal Gaikwad


Cassandra cluster hardware configuration

2016-08-16 Thread Kunal Gaikwad
Hi,

I want to set up a Cassandra cluster of about 3-5 nodes. Can anyone
suggest what hardware configuration I should consider, given an RF of 3?
The data size should be around 100 GB in the DT environment.

Regards,
Kunal Gaikwad


Re: Cassandra OOM on joining existing ring

2015-07-12 Thread Kunal Gangakhedkar
Hi,

Looks like that is my primary problem - the sstable count for the
daily_challenges column family is 5k. Azure had scheduled maintenance
window on Sat. All the VMs got rebooted one by one - including the current
cassandra one - and it's taking forever to bring cassandra back up online.

Is there any way I can re-organize my existing data so that I can bring
down that count?
I don't want to lose that data.
If possible, can I do that while cassandra is down? As I mentioned, it's
taking forever to get the service up - it's stuck in reading those 5k
sstable (+ another 5k of corresponding secondary index) files. :(
Oh, did I mention I'm new to cassandra?

Thanks,
Kunal

Kunal

On 11 July 2015 at 03:29, Sebastian Estevez sebastian.este...@datastax.com
wrote:

 #1

 There is one table - daily_challenges - which shows compacted partition
 max bytes as ~460M and another one - daily_guest_logins - which shows
 compacted partition max bytes as ~36M.


 460 is high, I like to keep my partitions under 100mb when possible. I've
 seen worse though. The fix is to add something else (maybe month or week or
 something) into your partition key:

  PRIMARY KEY ((segment_type, something_else), date, user_id, sess_id)
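 A sketch of that bucketing idea (my own illustration; month granularity is
 an arbitrary choice, not a prescription):

```python
from datetime import datetime

def month_bucket(ts: datetime) -> str:
    """Extra partition-key component so a single segment_type no longer
    accumulates one unbounded (~460MB) partition; rows spread across one
    partition per (segment_type, month) instead."""
    return ts.strftime("%Y-%m")

# The write path would then target partition (segment_type, month_bucket(date)):
print(month_bucket(datetime(2015, 7, 10)))  # -> 2015-07
```

 Reads spanning several months then have to query each bucket, which is the
 usual trade-off for bounding partition size.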

 #2 looks like your jamm version is 3 per your env.sh so you're probably
 okay to copy the env.sh over from the C* 3.0 link I shared once you
 uncomment and tweak the MAX_HEAP. If there's something wrong your node
 won't come up. Tail your logs.



 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Fri, Jul 10, 2015 at 2:44 PM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 And here is my cassandra-env.sh
 https://gist.github.com/kunalg/2c092cb2450c62be9a20

 Kunal

 On 11 July 2015 at 00:04, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 From jhat output, top 10 entries for Instance Count for All Classes
 (excluding platform) shows:

 2088223 instances of class org.apache.cassandra.db.BufferCell
 1983245 instances of class
 org.apache.cassandra.db.composites.CompoundSparseCellName
 1885974 instances of class
 org.apache.cassandra.db.composites.CompoundDenseCellName
 63 instances of class
 org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
 503687 instances of class org.apache.cassandra.db.BufferDeletedCell
 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
 101800 instances of class org.apache.cassandra.utils.concurrent.Ref
 101800 instances of class
 org.apache.cassandra.utils.concurrent.Ref$State
 90704 instances of class
 org.apache.cassandra.utils.concurrent.Ref$GlobalState
 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey

 At the bottom of the page, it shows:
 Total of 8739510 instances occupying 193607512 bytes.
 JFYI.

 Kunal

 On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 Thanks for quick reply.

 1. I don't know what are the thresholds that I should look for. So, to
 save this back-and-forth, I'm attaching the cfstats output for the 
 keyspace.

 There is one table - daily_challenges - which shows compacted partition
 max bytes as ~460M and another one - daily_guest_logins - which shows
 compacted partition max bytes as ~36M.

 Can that be a problem?
 Here is the CQL schema for the daily_challenges column family:

 CREATE TABLE app_10001.daily_challenges (
 segment_type text,
 date timestamp,
 user_id int,
 sess_id text,
 data text,
 deleted boolean,
 PRIMARY KEY (segment_type, date, user_id, sess_id)
 ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
 AND bloom_filter_fp_chance = 0.01
 AND caching = '{keys:ALL, rows_per_partition:NONE}'
 AND comment = ''
 AND compaction = {'min_threshold': '4', 'class':
 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
 'max_threshold': '32'}
 AND compression = {'sstable_compression':
 'org.apache.cassandra.io.compress.LZ4Compressor'}
 AND dclocal_read_repair_chance = 0.1
 AND default_time_to_live = 0
 AND gc_grace_seconds = 864000
 AND max_index_interval = 2048
 AND memtable_flush_period_in_ms = 0
 AND min_index_interval = 128

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Attaching the stack dump captured from the last OOM.

Kunal

On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com
wrote:

 Forgot to mention: the data size is not that big - it's barely 10GB in all.

 Kunal

 On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 Hi,

 I have a 2 node setup on Azure (east us region) running Ubuntu server
 14.04LTS.
 Both nodes have 8GB RAM.

 One of the nodes (seed node) died with OOM - so, I am trying to add a
 replacement node with same configuration.

 The problem is this new node also keeps dying with OOM - I've restarted
 the cassandra service like 8-10 times hoping that it would finish the
 replication. But it didn't help.

 The one node that is still up is happily chugging along.
 All nodes have similar configuration - with libjna installed.

 Cassandra is installed from datastax's debian repo - pkg: dsc21 version
 2.1.7.
 I started off with the default configuration - i.e. the default
 cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM
 = 2GB)

 But, that didn't help. So, I then tried to increase the heap to 4GB
 manually and restarted. It still keeps crashing.

 Any clue as to why it's happening?

 Thanks,
 Kunal



ERROR [SharedPool-Worker-6] 2015-07-10 05:12:16,862 
JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting 
forcefully due to:
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.init(HeapByteBuffer.java:57) ~[na:1.8.0_45]
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_45]
at 
org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:137)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:97) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:61)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.db.Memtable.put(Memtable.java:192) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1212) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.index.AbstractSimplePerColumnSecondaryIndex.insert(AbstractSimplePerColumnSecondaryIndex.java:131)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.index.SecondaryIndexManager$StandardUpdater.insert(SecondaryIndexManager.java:791)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.AtomicBTreeColumns$ColumnUpdater.apply(AtomicBTreeColumns.java:444)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.AtomicBTreeColumns$ColumnUpdater.apply(AtomicBTreeColumns.java:418)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.utils.btree.BTree.build(BTree.java:116) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.utils.btree.BTree.update(BTree.java:177) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.AtomicBTreeColumns.addAllWithSizeDelta(AtomicBTreeColumns.java:225)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.db.Memtable.put(Memtable.java:210) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1212) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:389) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:352) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:54) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) 
~[apache-cassandra-2.1.7.jar:2.1.7]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_45]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 ~[apache-cassandra-2.1.7.jar:2.1.7]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.1.7.jar:2.1.7]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
ERROR [CompactionExecutor:3] 2015-07-10 05:12:16,862 CassandraDaemon.java:223 - 
Exception in thread Thread[CompactionExecutor:3,1,main]
java.lang.OutOfMemoryError: Java heap space
at java.util.ArrayDeque.doubleCapacity(ArrayDeque.java:157) 
~[na:1.8.0_45

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Forgot to mention: the data size is not that big - it's barely 10GB in all.

Kunal

On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com
wrote:

 Hi,

 I have a 2 node setup on Azure (east us region) running Ubuntu server
 14.04LTS.
 Both nodes have 8GB RAM.

 One of the nodes (seed node) died with OOM - so, I am trying to add a
 replacement node with same configuration.

 The problem is this new node also keeps dying with OOM - I've restarted
 the cassandra service like 8-10 times hoping that it would finish the
 replication. But it didn't help.

 The one node that is still up is happily chugging along.
 All nodes have similar configuration - with libjna installed.

 Cassandra is installed from datastax's debian repo - pkg: dsc21 version
 2.1.7.
 I started off with the default configuration - i.e. the default
 cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM
 = 2GB)

 But, that didn't help. So, I then tried to increase the heap to 4GB
 manually and restarted. It still keeps crashing.

 Any clue as to why it's happening?

 Thanks,
 Kunal



Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Hi,

I have a 2 node setup on Azure (east us region) running Ubuntu server
14.04LTS.
Both nodes have 8GB RAM.

One of the nodes (seed node) died with OOM - so, I am trying to add a
replacement node with same configuration.

The problem is this new node also keeps dying with OOM - I've restarted the
cassandra service like 8-10 times hoping that it would finish the
replication. But it didn't help.

The one node that is still up is happily chugging along.
All nodes have similar configuration - with libjna installed.

Cassandra is installed from datastax's debian repo - pkg: dsc21 version
2.1.7.
I started off with the default configuration - i.e. the default
cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM
= 2GB)
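For reference, that automatic sizing can be paraphrased as follows (my
approximation of the stock cassandra-env.sh logic, not verbatim):

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximation of cassandra-env.sh auto heap sizing:
    max(min(RAM/2, 1024MB), min(RAM/4, 8192MB))."""
    return max(min(system_memory_mb // 2, 1024),
               min(system_memory_mb // 4, 8192))

print(default_max_heap_mb(8192))   # 8GB RAM  -> 2048 MB (the 2GB above)
print(default_max_heap_mb(14336))  # 14GB RAM -> 3584 MB
```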

But, that didn't help. So, I then tried to increase the heap to 4GB
manually and restarted. It still keeps crashing.

Any clue as to why it's happening?

Thanks,
Kunal


Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
I'm new to cassandra.
How do I find those out - mainly, the partition params that you asked for?
Others, I think I can figure out.

We don't have any large objects/blobs in the column values - it's all
textual, date-time, numeric and uuid data.

We use cassandra to primarily store segmentation data - with segment type
as partition key. That is again divided into two separate column families;
but they have similar structure.

Columns per row can be fairly large - each segment type as the row key and
associated user ids and timestamp as column value.

Thanks,
Kunal

On 10 July 2015 at 16:36, Jack Krupansky jack.krupan...@gmail.com wrote:

 What does your data and data model look like - partition size, rows per
 partition, number of columns per row, any large values/blobs in column
 values?

 You could run fine on an 8GB system, but only if your rows and partitions
 are reasonably small. Any large partitions could blow you away.

 -- Jack Krupansky

 On Fri, Jul 10, 2015 at 4:22 AM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 Attaching the stack dump captured from the last OOM.

 Kunal

 On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 Forgot to mention: the data size is not that big - it's barely 10GB in
 all.

 Kunal

 On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 Hi,

 I have a 2 node setup on Azure (east us region) running Ubuntu server
 14.04LTS.
 Both nodes have 8GB RAM.

 One of the nodes (seed node) died with OOM - so, I am trying to add a
 replacement node with same configuration.

 The problem is this new node also keeps dying with OOM - I've restarted
 the cassandra service like 8-10 times hoping that it would finish the
 replication. But it didn't help.

 The one node that is still up is happily chugging along.
 All nodes have similar configuration - with libjna installed.

 Cassandra is installed from datastax's debian repo - pkg: dsc21 version
 2.1.7.
 I started off with the default configuration - i.e. the default
 cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM
 = 2GB)

 But, that didn't help. So, I then tried to increase the heap to 4GB
 manually and restarted. It still keeps crashing.

 Any clue as to why it's happening?

 Thanks,
 Kunal







Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Thanks for quick reply.

1. I don't know what are the thresholds that I should look for. So, to save
this back-and-forth, I'm attaching the cfstats output for the keyspace.

There is one table - daily_challenges - which shows compacted partition max
bytes as ~460M and another one - daily_guest_logins - which shows compacted
partition max bytes as ~36M.

Can that be a problem?
Here is the CQL schema for the daily_challenges column family:

CREATE TABLE app_10001.daily_challenges (
segment_type text,
date timestamp,
user_id int,
sess_id text,
data text,
deleted boolean,
PRIMARY KEY (segment_type, date, user_id, sess_id)
) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{keys:ALL, rows_per_partition:NONE}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32'}
AND compression = {'sstable_compression':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';

CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);


2. I don't know - how do I check? As I mentioned, I just installed the
dsc21 update from datastax's debian repo (ver 2.1.7).

Really appreciate your help.

Thanks,
Kunal

On 10 July 2015 at 23:33, Sebastian Estevez sebastian.este...@datastax.com
wrote:

 1. You want to look at # of sstables in cfhistograms or in cfstats look at:
 Compacted partition maximum bytes
 Maximum live cells per slice

 2) No, here's the env.sh from 3.0 which should work with some tweaks:

 https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh

 You'll at least have to modify the jamm version to what's in yours. I
 think it's 2.5



 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 Thanks, Sebastian.

 Couple of questions (I'm really new to cassandra):
 1. How do I interpret the output of 'nodetool cfstats' to figure out the
 issues? Any documentation pointer on that would be helpful.

 2. I'm primarily a python/c developer - so, totally clueless about JVM
 environment. So, please bear with me as I would need a lot of hand-holding.
 Should I just copy+paste the settings you gave and try to restart the
 failing cassandra server?

 Thanks,
 Kunal

 On 10 July 2015 at 22:35, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 #1 You need more information.

 a) Take a look at your .hprof file (memory heap from the OOM) with an
 introspection tool like jhat or visualvm or java flight recorder and see
 what is using up your RAM.

 b) How big are your large rows (use nodetool cfstats on each node). If
 your data model is bad, you are going to have to re-design it no matter
 what.

 #2 As a possible workaround try using the G1GC allocator with the
 settings from c* 3.0 instead of CMS. I've seen lots of success with it
 lately (tl;dr G1GC is much simpler than CMS and almost as good as a finely
 tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not*
 set the newgen size for G1 sets it dynamically:

 # min and max heap sizes should be set to the same value to avoid
 # stop-the-world GC pauses during resize, and so that we can lock the
 # heap in memory on startup to prevent any of it from being swapped
 # out.
 JVM_OPTS=$JVM_OPTS -Xms${MAX_HEAP_SIZE}
 JVM_OPTS=$JVM_OPTS -Xmx${MAX_HEAP_SIZE}

 # Per-thread stack size.
 JVM_OPTS=$JVM_OPTS -Xss256k

 # Use the Hotspot garbage-first collector.
 JVM_OPTS=$JVM_OPTS -XX:+UseG1GC

 # Have the JVM do less remembered set work during STW, instead
 # preferring concurrent GC. Reduces p99.9 latency.
 JVM_OPTS=$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5

 # The JVM maximum is 8 PGC threads and 1/4

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
And here is my cassandra-env.sh
https://gist.github.com/kunalg/2c092cb2450c62be9a20

Kunal

On 11 July 2015 at 00:04, Kunal Gangakhedkar kgangakhed...@gmail.com
wrote:

 From jhat output, top 10 entries for Instance Count for All Classes
 (excluding platform) shows:

 2088223 instances of class org.apache.cassandra.db.BufferCell
 1983245 instances of class
 org.apache.cassandra.db.composites.CompoundSparseCellName
 1885974 instances of class
 org.apache.cassandra.db.composites.CompoundDenseCellName
 63 instances of class
 org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
 503687 instances of class org.apache.cassandra.db.BufferDeletedCell
 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
 101800 instances of class org.apache.cassandra.utils.concurrent.Ref
 101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State
 90704 instances of class
 org.apache.cassandra.utils.concurrent.Ref$GlobalState
 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey

 At the bottom of the page, it shows:
 Total of 8739510 instances occupying 193607512 bytes.
 JFYI.
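 A quick sanity check on those jhat totals (simple arithmetic, nothing
 Cassandra-specific): the heap is dominated by millions of tiny cell and
 cell-name objects rather than a few large blobs, which points at very
 wide partitions.

```python
instances = 8_739_510        # total instances reported by jhat
total_bytes = 193_607_512    # total bytes reported by jhat
avg = total_bytes / instances
print(round(avg))            # ~22 bytes per object on average
```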

 Kunal

 On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 Thanks for quick reply.

 1. I don't know what are the thresholds that I should look for. So, to
 save this back-and-forth, I'm attaching the cfstats output for the keyspace.

 There is one table - daily_challenges - which shows compacted partition
 max bytes as ~460M and another one - daily_guest_logins - which shows
 compacted partition max bytes as ~36M.

 Can that be a problem?
 Here is the CQL schema for the daily_challenges column family:

 CREATE TABLE app_10001.daily_challenges (
 segment_type text,
 date timestamp,
 user_id int,
 sess_id text,
 data text,
 deleted boolean,
 PRIMARY KEY (segment_type, date, user_id, sess_id)
 ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
 AND bloom_filter_fp_chance = 0.01
 AND caching = '{keys:ALL, rows_per_partition:NONE}'
 AND comment = ''
 AND compaction = {'min_threshold': '4', 'class':
 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
 'max_threshold': '32'}
 AND compression = {'sstable_compression':
 'org.apache.cassandra.io.compress.LZ4Compressor'}
 AND dclocal_read_repair_chance = 0.1
 AND default_time_to_live = 0
 AND gc_grace_seconds = 864000
 AND max_index_interval = 2048
 AND memtable_flush_period_in_ms = 0
 AND min_index_interval = 128
 AND read_repair_chance = 0.0
 AND speculative_retry = '99.0PERCENTILE';

 CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);


 2. I don't know - how do I check? As I mentioned, I just installed the
 dsc21 update from datastax's debian repo (ver 2.1.7).

 Really appreciate your help.

 Thanks,
 Kunal

 On 10 July 2015 at 23:33, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 1. You want to look at # of sstables in cfhistograms or in cfstats look
 at:
 Compacted partition maximum bytes
 Maximum live cells per slice

 2) No, here's the env.sh from 3.0 which should work with some tweaks:

 https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh

 You'll at least have to modify the jamm version to what's in yours. I
 think it's 2.5



 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 Thanks, Sebastian.

 Couple of questions (I'm really new to cassandra):
 1. How do I interpret the output of 'nodetool cfstats' to figure out
 the issues? Any documentation pointer on that would be helpful.

 2. I'm primarily a python/c developer - so, totally clueless about JVM
 environment. So, please bear with me as I would need a lot of hand-holding.
 Should I just copy+paste the settings you gave and try to restart the
 failing cassandra server?

 Thanks,
 Kunal

 On 10 July 2015 at 22:35, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 #1 You need more information.

 a) Take a look at your .hprof file (memory heap from the OOM

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Thanks, Sebastian.

Couple of questions (I'm really new to cassandra):
1. How do I interpret the output of 'nodetool cfstats' to figure out the
issues? Any documentation pointer on that would be helpful.

2. I'm primarily a python/c developer - so, totally clueless about JVM
environment. So, please bear with me as I would need a lot of hand-holding.
Should I just copy+paste the settings you gave and try to restart the
failing cassandra server?

Thanks,
Kunal

On 10 July 2015 at 22:35, Sebastian Estevez sebastian.este...@datastax.com
wrote:

 #1 You need more information.

 a) Take a look at your .hprof file (memory heap from the OOM) with an
 introspection tool like jhat or visualvm or java flight recorder and see
 what is using up your RAM.

 b) How big are your large rows (use nodetool cfstats on each node). If
 your data model is bad, you are going to have to re-design it no matter
 what.

 #2 As a possible workaround try using the G1GC allocator with the settings
 from c* 3.0 instead of CMS. I've seen lots of success with it lately (tl;dr
 G1GC is much simpler than CMS and almost as good as a finely tuned CMS).
 *Note:* Use it with the latest Java 8 from Oracle. Do *not* set the
 newgen size; G1 sets it dynamically:

 # min and max heap sizes should be set to the same value to avoid
 # stop-the-world GC pauses during resize, and so that we can lock the
 # heap in memory on startup to prevent any of it from being swapped
 # out.
 JVM_OPTS=$JVM_OPTS -Xms${MAX_HEAP_SIZE}
 JVM_OPTS=$JVM_OPTS -Xmx${MAX_HEAP_SIZE}

 # Per-thread stack size.
 JVM_OPTS=$JVM_OPTS -Xss256k

 # Use the Hotspot garbage-first collector.
 JVM_OPTS=$JVM_OPTS -XX:+UseG1GC

 # Have the JVM do less remembered set work during STW, instead
 # preferring concurrent GC. Reduces p99.9 latency.
 JVM_OPTS=$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5

 # The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
 # Machines with > 10 cores may need additional threads.
 # Increase to <= full cores (do not count HT cores).
 #JVM_OPTS=$JVM_OPTS -XX:ParallelGCThreads=16
 #JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=16

 # Main G1GC tunable: lowering the pause target will lower throughput and
 vise versa.
 # 200ms is the JVM default and lowest viable setting
 # 1000ms increases throughput. Keep it smaller than the timeouts in
 cassandra.yaml.
 JVM_OPTS=$JVM_OPTS -XX:MaxGCPauseMillis=500
 # Do reference processing in parallel GC.
 JVM_OPTS=$JVM_OPTS -XX:+ParallelRefProcEnabled

 # This may help eliminate STW.
 # The default in Hotspot 8u40 is 40%.
 #JVM_OPTS=$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=25

 # For workloads that do large allocations, increasing the region
 # size may make things more efficient. Otherwise, let the JVM
 # set this automatically.
 #JVM_OPTS=$JVM_OPTS -XX:G1HeapRegionSize=32m

 # Make sure all memory is faulted and zeroed on startup.
 # This helps prevent soft faults in containers and makes
 # transparent hugepage allocation more effective.
 JVM_OPTS=$JVM_OPTS -XX:+AlwaysPreTouch

 # Biased locking does not benefit Cassandra.
 JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking

 # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
 JVM_OPTS=$JVM_OPTS -XX:StringTableSize=103

 # Enable thread-local allocation blocks and allow the JVM to automatically
 # resize them at runtime.
 JVM_OPTS=$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB

 # http://www.evanjones.ca/jvm-mmap-pause.html
 JVM_OPTS=$JVM_OPTS -XX:+PerfDisableSharedMem
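If you paste these options into your own cassandra-env.sh, it is worth checking the file still parses before restarting the service. A minimal sketch (the file path is an assumption; on the Debian packages it is usually /etc/cassandra/cassandra-env.sh):

```shell
# Hypothetical sketch: write a few of the G1 options into a scratch
# cassandra-env.sh fragment, then verify it parses as valid shell.
ENV_FILE="cassandra-env.sh"   # assumed path; adjust for your install
cat > "$ENV_FILE" <<'EOF'
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
EOF
bash -n "$ENV_FILE" && echo "syntax OK"
```

Note the quoting: each line must be JVM_OPTS="$JVM_OPTS -Xfoo", or the shell will try to execute the flag as a command.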


 All the best,


http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

https://www.linkedin.com/company/datastax
https://www.facebook.com/datastax
https://twitter.com/datastax
https://plus.google.com/+Datastax/about
http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the world's
most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Fri, Jul 10, 2015 at 12:55 PM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 I upgraded my instance from 8GB to a 14GB one.
 Allocated 8GB to jvm heap in cassandra-env.sh.

 And now, it crashes even faster with an OOM..

 Earlier, with 4GB heap, I could go upto ~90% replication completion (as
 reported by nodetool netstats); now, with 8GB heap, I cannot even get
 there. I've already restarted cassandra service 4 times with 8GB heap.

 No clue what's going on.. :(

 Kunal

 On 10 July 2015 at 17:45, Jack Krupansky jack.krupan...@gmail.com
 wrote:

You, and only you, are responsible for knowing your data and data model.

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
From jhat output, top 10 entries for Instance Count for All Classes
(excluding platform) shows:

2088223 instances of class org.apache.cassandra.db.BufferCell
1983245 instances of class
org.apache.cassandra.db.composites.CompoundSparseCellName
1885974 instances of class
org.apache.cassandra.db.composites.CompoundDenseCellName
63 instances of class
org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
503687 instances of class org.apache.cassandra.db.BufferDeletedCell
378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
101800 instances of class org.apache.cassandra.utils.concurrent.Ref
101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State
90704 instances of class
org.apache.cassandra.utils.concurrent.Ref$GlobalState
71123 instances of class org.apache.cassandra.db.BufferDecoratedKey

At the bottom of the page, it shows:
Total of 8739510 instances occupying 193607512 bytes.
JFYI.
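For scale, the jhat totals quoted above are small relative to the configured heap. Quick shell arithmetic on the reported numbers:

```shell
# Arithmetic on the jhat summary above: 8,739,510 instances
# occupying 193,607,512 bytes (integer division, so results round down).
TOTAL_BYTES=193607512
INSTANCES=8739510
echo "$((TOTAL_BYTES / 1024 / 1024)) MiB total"          # -> 184 MiB total
echo "$((TOTAL_BYTES / INSTANCES)) bytes/instance avg"   # -> 22 bytes/instance avg
```

Roughly 184 MiB of counted objects is far below a 4-8GB heap; note that jhat's per-class totals are shallow sizes, so large backing ByteBuffers behind those BufferCell instances may be under-counted.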

Kunal

On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com
wrote:

 Thanks for the quick reply.

 1. I don't know what thresholds I should look for. So, to save this
 back-and-forth, I'm attaching the cfstats output for the keyspace.

 There is one table - daily_challenges - which shows compacted partition
 max bytes as ~460M and another one - daily_guest_logins - which shows
 compacted partition max bytes as ~36M.

 Can that be a problem?
 Here is the CQL schema for the daily_challenges column family:

 CREATE TABLE app_10001.daily_challenges (
 segment_type text,
 date timestamp,
 user_id int,
 sess_id text,
 data text,
 deleted boolean,
 PRIMARY KEY (segment_type, date, user_id, sess_id)
 ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
 AND bloom_filter_fp_chance = 0.01
 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
 AND comment = ''
 AND compaction = {'min_threshold': '4', 'class':
 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
 'max_threshold': '32'}
 AND compression = {'sstable_compression':
 'org.apache.cassandra.io.compress.LZ4Compressor'}
 AND dclocal_read_repair_chance = 0.1
 AND default_time_to_live = 0
 AND gc_grace_seconds = 864000
 AND max_index_interval = 2048
 AND memtable_flush_period_in_ms = 0
 AND min_index_interval = 128
 AND read_repair_chance = 0.0
 AND speculative_retry = '99.0PERCENTILE';

 CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);


 2. I don't know - how do I check? As I mentioned, I just installed the
 dsc21 update from datastax's debian repo (ver 2.1.7).

 Really appreciate your help.

 Thanks,
 Kunal

 On 10 July 2015 at 23:33, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 1. You want to look at the # of sstables in cfhistograms, or in cfstats
 look at:
 Compacted partition maximum bytes
 Maximum live cells per slice
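A sketch of pulling those two lines out of cfstats. The excerpt below is canned, with hypothetical values, so the filter can be shown without a live node:

```shell
# Canned excerpt standing in for `nodetool cfstats <keyspace>` output
# (the numbers are made up); the grep is what you would pipe the real
# command through.
cfstats='Table: daily_challenges
Compacted partition maximum bytes: 464228842
Maximum live cells per slice (last five minutes): 5722'
echo "$cfstats" | grep -E 'Compacted partition maximum bytes|Maximum live cells'
# On a live node, something like:
#   nodetool cfstats app_10001 | grep -E 'Compacted partition maximum bytes|Maximum live cells'
```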

 2) No, here's the env.sh from 3.0 which should work with some tweaks:

 https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh

 You'll at least have to modify the jamm version to what's in yours. I
 think it's 2.5



 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 Thanks, Sebastian.

 Couple of questions (I'm really new to cassandra):
 1. How do I interpret the output of 'nodetool cfstats' to figure out the
 issues? Any documentation pointer on that would be helpful.

 2. I'm primarily a python/c developer - so, totally clueless about JVM
 environment. So, please bear with me as I would need a lot of hand-holding.
 Should I just copy+paste the settings you gave and try to restart the
 failing cassandra server?

 Thanks,
 Kunal

 On 10 July 2015 at 22:35, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 #1 You need more information.

 a) Take a look at your .hprof file (memory heap from the OOM) with an
 introspection tool like jhat or visualvm or java flight recorder and see
 what is using up your RAM.

 b) How big are your large rows (use nodetool cfstats on each node). If
 your data model is bad, you are going to have to re-design it no matter what.

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
I upgraded my instance from 8GB to a 14GB one.
Allocated 8GB to jvm heap in cassandra-env.sh.

And now, it crashes even faster with an OOM..

Earlier, with 4GB heap, I could go upto ~90% replication completion (as
reported by nodetool netstats); now, with 8GB heap, I cannot even get
there. I've already restarted cassandra service 4 times with 8GB heap.

No clue what's going on.. :(

Kunal

On 10 July 2015 at 17:45, Jack Krupansky jack.krupan...@gmail.com wrote:

 You, and only you, are responsible for knowing your data and data model.

 If columns per row or rows per partition can be large, then an 8GB system
 is probably too small. But the real issue is that you need to keep your
 partition size from getting too large.

 Generally, an 8GB system is okay, but only for reasonably-sized
 partitions, like under 10MB.


 -- Jack Krupansky

 On Fri, Jul 10, 2015 at 8:05 AM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 I'm new to cassandra.
 How do I find those out? - mainly, the partition params that you asked
 for. Others, I think I can figure out.

 We don't have any large objects/blobs in the column values - it's all
 textual, date-time, numeric and uuid data.

 We use cassandra to primarily store segmentation data - with segment type
 as partition key. That is again divided into two separate column families;
 but they have similar structure.

 Columns per row can be fairly large - each segment type as the row key
 and associated user ids and timestamp as column value.

 Thanks,
 Kunal

 On 10 July 2015 at 16:36, Jack Krupansky jack.krupan...@gmail.com
 wrote:

 What does your data and data model look like - partition size, rows per
 partition, number of columns per row, any large values/blobs in column
 values?

 You could run fine on an 8GB system, but only if your rows and
 partitions are reasonably small. Any large partitions could blow you away.

 -- Jack Krupansky

 On Fri, Jul 10, 2015 at 4:22 AM, Kunal Gangakhedkar 
 kgangakhed...@gmail.com wrote:

 Attaching the stack dump captured from the last OOM.

 Kunal

 On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 Forgot to mention: the data size is not that big - it's barely 10GB in
 all.

 Kunal

 On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com
 wrote:

 Hi,

 I have a 2 node setup on Azure (east us region) running Ubuntu server
 14.04LTS.
 Both nodes have 8GB RAM.

 One of the nodes (seed node) died with OOM - so, I am trying to add a
 replacement node with same configuration.

 The problem is this new node also keeps dying with OOM - I've
 restarted the cassandra service like 8-10 times hoping that it would finish
 the replication. But it didn't help.

 The one node that is still up is happily chugging along.
 All nodes have similar configuration - with libjna installed.

 Cassandra is installed from datastax's debian repo - pkg: dsc21
 version 2.1.7.
 I started off with the default configuration - i.e. the default
 cassandra-env.sh - which calculates the heap size automatically
 (1/4 * RAM = 2GB).
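For reference, the auto-sizing in the stock 2.1-era cassandra-env.sh works out roughly like this (a sketch from memory of its calculate_heap_sizes function; treat the formula as an approximation, not the exact script):

```shell
# Approximate heap auto-sizing: MAX_HEAP = max(min(RAM/2, 1024MB), min(RAM/4, 8192MB))
system_memory_in_mb=8192   # the 8GB Azure nodes in this thread
half=$((system_memory_in_mb / 2))
quarter=$((system_memory_in_mb / 4))
cap_half=$(( half < 1024 ? half : 1024 ))          # RAM/2, capped at 1GB
cap_quarter=$(( quarter < 8192 ? quarter : 8192 )) # RAM/4, capped at 8GB
max_heap_mb=$(( cap_half > cap_quarter ? cap_half : cap_quarter ))
echo "MAX_HEAP_SIZE=${max_heap_mb}M"   # -> MAX_HEAP_SIZE=2048M on an 8GB box
```

This is why an 8GB node defaults to a 2GB heap, which matches the "(1/4 * RAM = 2GB)" above.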

 But, that didn't help. So, I then tried to increase the heap to 4GB
 manually and restarted. It still keeps crashing.

 Any clue as to why it's happening?

 Thanks,
 Kunal