Re: cassandra OOM

2017-04-03 Thread Alexander Dejanovski
Hi,

we've seen G1 go OOM on production clusters (repeatedly) with a 16GB
heap when the workload is intense, and given you're running on m4.2xl I
wouldn't go over 16GB for the heap.

I'd suggest reverting to CMS, using a 16GB heap and up to 6GB of new
gen. You can use 5 as the MaxTenuringThreshold as an initial value, and
activate GC logging to fine-tune the settings afterwards.

FYI, CMS tends to perform better than G1, even though it's a little bit
harder to tune.
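A sketch of what that could look like in jvm.options — the heap and new-gen sizes come from the advice above, the remaining flags from the stock CMS section of the file quoted below; treat it as a starting point to tune against GC logs, not a universal recommendation:

```
-Xms16G
-Xmx16G
-Xmn6G
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=5
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
# GC logging, to fine-tune afterwards
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/var/log/cassandra/gc.log
```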

Cheers,

On Mon, Apr 3, 2017 at 10:54 PM Gopal, Dhruva 
wrote:

> 16 Gig heap, with G1. Pertinent info from jvm.options below (we’re using
> m2.2xlarge instances in AWS):
>
> #
> # HEAP SETTINGS #
> #
>
> # Heap size is automatically calculated by cassandra-env based on this
> # formula: max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
> # That is:
> # - calculate 1/2 ram and cap to 1024MB
> # - calculate 1/4 ram and cap to 8192MB
> # - pick the max
> #
> # For production use you may wish to adjust this for your environment.
> # If that's the case, uncomment the -Xmx and Xms options below to override the
> # automatic calculation of JVM heap memory.
> #
> # It is recommended to set min (-Xms) and max (-Xmx) heap sizes to
> # the same value to avoid stop-the-world GC pauses during resize, and
> # so that we can lock the heap in memory on startup to prevent any
> # of it from being swapped out.
> -Xms16G
> -Xmx16G
>
> # Young generation size is automatically calculated by cassandra-env
> # based on this formula: min(100 * num_cores, 1/4 * heap size)
> #
> # The main trade-off for the young generation is that the larger it
> # is, the longer GC pause times will be. The shorter it is, the more
> # expensive GC will be (usually).
> #
> # It is not recommended to set the young generation size if using the
> # G1 GC, since that will override the target pause-time goal.
> # More info: http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
> #
> # The example below assumes a modern 8-core+ machine for decent
> # times. If in doubt, and if you do not particularly want to tweak, go
> # 100 MB per physical CPU core.
> #-Xmn800M
>
> #
> #  GC SETTINGS  #
> #
>
> ### CMS Settings
>
> #-XX:+UseParNewGC
> #-XX:+UseConcMarkSweepGC
> #-XX:+CMSParallelRemarkEnabled
> #-XX:SurvivorRatio=8
> #-XX:MaxTenuringThreshold=1
> #-XX:CMSInitiatingOccupancyFraction=75
> #-XX:+UseCMSInitiatingOccupancyOnly
> #-XX:CMSWaitDuration=1
> #-XX:+CMSParallelInitialMarkEnabled
> #-XX:+CMSEdenChunksRecordAlways
> # some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
> #-XX:+CMSClassUnloadingEnabled
>
> ### G1 Settings (experimental, comment previous section and uncomment section below to enable)
>
> ## Use the Hotspot garbage-first collector.
> -XX:+UseG1GC
> #
> ## Have the JVM do less remembered set work during STW, instead
> ## preferring concurrent GC. Reduces p99.9 latency.
> -XX:G1RSetUpdatingPauseTimePercent=5
> #
> ## Main G1GC tunable: lowering the pause target will lower throughput and vice versa.
> ## 200ms is the JVM default and lowest viable setting
> ## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
> -XX:MaxGCPauseMillis=500
>
> ## Optional G1 Settings
>
> # Save CPU time on large (>= 16GB) heaps by delaying region scanning
> # until the heap is 70% full. The default in Hotspot 8u40 is 40%.
> -XX:InitiatingHeapOccupancyPercent=70
>
> # For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores.
> # Otherwise equal to the number of cores when 8 or less.
> # Machines with > 10 cores should try setting these to <= full cores.
> #-XX:ParallelGCThreads=16
> # By default, ConcGCThreads is 1/4 of ParallelGCThreads.
> # Setting both to the same value can reduce STW durations.
> #-XX:ConcGCThreads=16
>
> ### GC logging options -- uncomment to enable
>
> #-XX:+PrintGCDetails
> #-XX:+PrintGCDateStamps
> #-XX:+PrintHeapAtGC
> #-XX:+PrintTenuringDistribution
> #-XX:+PrintGCApplicationStoppedTime
> #-XX:+PrintPromotionFailure
> #-XX:PrintFLSStatistics=1
> #-Xloggc:/var/log/cassandra/gc.log
> #-XX:+UseGCLogFileRotation
> #-XX:NumberOfGCLogFiles=10
> #-XX:GCLogFileSize=10M
>
> *From: *Alexander Dejanovski 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Monday, April 3, 2017 at 8:00 AM
> *To: *"user@cassandra.apache.org" 
> *Subject: *Re: cassandra OOM
>
> Hi,
>
> Could you share your GC settings? G1 or CMS? Heap size, etc.
>
> Thanks,
>
> On Sun, Apr 2, 2017 at 10:30 PM Gopal, Dhruva 
> wrote:
>
> Hi –
>
>   We’ve had what looks like an OOM 

Re: Node Gossiping Information.

2017-04-03 Thread Jeff Jirsa


On 2017-04-02 11:27 (-0700), Pranay akula  wrote: 
> Where can we check the gossip information of a node? I couldn't find
> anything in the system keyspace.
> 
> Is it possible to update or refresh gossip information on a node without
> restarting? Does enabling and disabling gossip help to refresh the gossip
> information on that node?
> 


"nodetool gossipinfo"




Re: nodetool tablestats reporting local read count of 0, incorrectly

2017-04-03 Thread Jeff Jirsa


On 2017-04-03 12:42 (-0700), Voytek Jarnot  wrote: 
> Continuing to grasp at straws...
> 
> Is it possible that indexing is modifying the read path such that the
> tablestats/tablehistograms output is no longer trustworthy?  I notice more
> realistic "local read count" numbers on tables which do not utilize SASI.
> 
> Would greatly appreciate any thoughts,
> Thanks.
> 

Not the most ridiculous thought, though that would not be expected behavior. 
It would be great if you could open a JIRA.



Re: Maximum memory usage reached in cassandra!

2017-04-03 Thread Mark Rose
You may have better luck switching to G1GC and using a much larger
heap (16 to 30GB). 4GB is likely too small for your amount of data,
especially if you have a lot of sstables. Then try increasing
file_cache_size_in_mb further.

Cheers,
Mark
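For context on the numbers in the error messages quoted below: they are exactly the configured cap converted to bytes, which confirms the limit being hit is file_cache_size_in_mb. A quick arithmetic check:

```python
# The "Maximum memory usage reached" figures quoted below are simply the
# file_cache_size_in_mb cap expressed in bytes; each failed allocation is
# one 1 MB buffer-pool chunk.
MB = 1024 * 1024

original_cap = 512 * MB    # 536870912 bytes -> the default 512 MB cap
raised_cap = 1024 * MB     # 1073741824 bytes -> the raised 1024 MB cap
chunk = 1 * MB             # 1048576 bytes -> the chunk that failed to allocate

print(original_cap, raised_cap, chunk)
```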

On Tue, Mar 28, 2017 at 3:01 AM, Mokkapati, Bhargav (Nokia -
IN/Chennai)  wrote:
> Hi Cassandra users,
>
> I am getting “Maximum memory usage reached (536870912 bytes), cannot
> allocate chunk of 1048576 bytes”. As a remedy I changed the off-heap
> memory usage cap, i.e. the file_cache_size_in_mb parameter in
> cassandra.yaml, from 512 to 1024.
>
> But now the increased limit has filled up again, and it throws “Maximum
> memory usage reached (1073741824 bytes), cannot allocate chunk of
> 1048576 bytes”.
>
> This issue occurs when redistribution of indexes is happening; the
> Cassandra nodes are still UP, but read requests are failing on the
> application side.
>
> My configuration details are as below:
>
> 5 node cluster, each node with 68 disks, each disk is 3.7 TB
> Total CPU cores - 8
> total Mem: 377G, used: 265G, free: 58G, shared: 378M, buff/cache: 53G, available: 104G
> MAX_HEAP_SIZE is 4GB
> file_cache_size_in_mb: 1024
>
> memtable heap space is commented out in the yaml file as below:
> # memtable_heap_space_in_mb: 2048
> # memtable_offheap_space_in_mb: 2048
>
> Can anyone please suggest a solution for this issue. Thanks in advance!
>
> Thanks,
> Bhargav M


Re: cassandra OOM

2017-04-03 Thread Gopal, Dhruva
16 Gig heap, with G1. Pertinent info from jvm.options below (we’re using 
m2.2xlarge instances in AWS):


#
# HEAP SETTINGS #
#

# Heap size is automatically calculated by cassandra-env based on this
# formula: max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
# That is:
# - calculate 1/2 ram and cap to 1024MB
# - calculate 1/4 ram and cap to 8192MB
# - pick the max
#
# For production use you may wish to adjust this for your environment.
# If that's the case, uncomment the -Xmx and Xms options below to override the
# automatic calculation of JVM heap memory.
#
# It is recommended to set min (-Xms) and max (-Xmx) heap sizes to
# the same value to avoid stop-the-world GC pauses during resize, and
# so that we can lock the heap in memory on startup to prevent any
# of it from being swapped out.
-Xms16G
-Xmx16G

# Young generation size is automatically calculated by cassandra-env
# based on this formula: min(100 * num_cores, 1/4 * heap size)
#
# The main trade-off for the young generation is that the larger it
# is, the longer GC pause times will be. The shorter it is, the more
# expensive GC will be (usually).
#
# It is not recommended to set the young generation size if using the
# G1 GC, since that will override the target pause-time goal.
# More info: http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
#
# The example below assumes a modern 8-core+ machine for decent
# times. If in doubt, and if you do not particularly want to tweak, go
# 100 MB per physical CPU core.
#-Xmn800M

#
#  GC SETTINGS  #
#

### CMS Settings

#-XX:+UseParNewGC
#-XX:+UseConcMarkSweepGC
#-XX:+CMSParallelRemarkEnabled
#-XX:SurvivorRatio=8
#-XX:MaxTenuringThreshold=1
#-XX:CMSInitiatingOccupancyFraction=75
#-XX:+UseCMSInitiatingOccupancyOnly
#-XX:CMSWaitDuration=1
#-XX:+CMSParallelInitialMarkEnabled
#-XX:+CMSEdenChunksRecordAlways
# some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
#-XX:+CMSClassUnloadingEnabled

### G1 Settings (experimental, comment previous section and uncomment section 
below to enable)

## Use the Hotspot garbage-first collector.
-XX:+UseG1GC
#
## Have the JVM do less remembered set work during STW, instead
## preferring concurrent GC. Reduces p99.9 latency.
-XX:G1RSetUpdatingPauseTimePercent=5
#
## Main G1GC tunable: lowering the pause target will lower throughput and vice 
versa.
## 200ms is the JVM default and lowest viable setting
## 1000ms increases throughput. Keep it smaller than the timeouts in 
cassandra.yaml.
-XX:MaxGCPauseMillis=500

## Optional G1 Settings

# Save CPU time on large (>= 16GB) heaps by delaying region scanning
# until the heap is 70% full. The default in Hotspot 8u40 is 40%.
-XX:InitiatingHeapOccupancyPercent=70

# For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number 
of logical cores.
# Otherwise equal to the number of cores when 8 or less.
# Machines with > 10 cores should try setting these to <= full cores.
#-XX:ParallelGCThreads=16
# By default, ConcGCThreads is 1/4 of ParallelGCThreads.
# Setting both to the same value can reduce STW durations.
#-XX:ConcGCThreads=16

### GC logging options -- uncomment to enable

#-XX:+PrintGCDetails
#-XX:+PrintGCDateStamps
#-XX:+PrintHeapAtGC
#-XX:+PrintTenuringDistribution
#-XX:+PrintGCApplicationStoppedTime
#-XX:+PrintPromotionFailure
#-XX:PrintFLSStatistics=1
#-Xloggc:/var/log/cassandra/gc.log
#-XX:+UseGCLogFileRotation
#-XX:NumberOfGCLogFiles=10
#-XX:GCLogFileSize=10M
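The two auto-calculation comments in the file above encode simple arithmetic that is easy to sanity-check. A short sketch (Python; the 32 GB RAM figure is an assumption for an m4.2xlarge-class machine, and the thread formula is the comment's rule of thumb, not the exact HotSpot ergonomics):

```python
def auto_heap_mb(ram_mb):
    # cassandra-env's heap formula quoted above:
    # max(min(1/2 ram, 1024MB), min(1/4 ram, 8192MB))
    return max(min(ram_mb // 2, 1024), min(ram_mb // 4, 8192))

def parallel_gc_threads(logical_cores):
    # Rule of thumb from the comment above: equal to the core count up to 8,
    # roughly 5/8 of the logical cores beyond that.
    return logical_cores if logical_cores <= 8 else (5 * logical_cores) // 8

# With 32 GB of RAM, the auto-calculated heap caps at 8 GB -- which is why
# -Xms16G/-Xmx16G have to be set explicitly to get a 16 GB heap.
print(auto_heap_mb(32 * 1024))   # 8192
print(parallel_gc_threads(16))   # 10
```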


From: Alexander Dejanovski 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, April 3, 2017 at 8:00 AM
To: "user@cassandra.apache.org" 
Subject: Re: cassandra OOM

Hi,

Could you share your GC settings? G1 or CMS? Heap size, etc.

Thanks,

On Sun, Apr 2, 2017 at 10:30 PM Gopal, Dhruva 
> wrote:
Hi –
  We’ve had what looks like an OOM situation with Cassandra (we have a dump 
file that got generated) in our staging (performance/load testing) environment, 
and I wanted to reach out to this user group to see if you had any 
recommendations on how we should approach our investigation into the cause of 
this issue. The logs don’t seem to point to any obvious issues, and we’re not 
experts in analyzing this by any means, so I was looking for guidance on how to 
proceed. Should we enter a Jira as well? We’re on Cassandra 3.9 and are 
running a six-node cluster. This happened in a controlled load-testing 
environment. Feedback will be much appreciated!


Regards,
Dhruva

This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute 

cql3 - adding two numbers in insert statement

2017-04-03 Thread Shreyas Chandra Sekhar
Hi,

I am trying to generate a random value of a certain length and use it as one
of the values in CQL3. Below is an example:

INSERT INTO "KS"."CF" (key, column1, value) VALUES
(613462303431313435313838306530667c6263317431756331, 2633174317563312f6f36,
blobAsUuid(timeuuidAsBlob(now())) + 1000);

This errors out; can anyone help with the right syntax?

Thanks,

Shreyas
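For what it's worth, CQL in Cassandra 3.x does not support arithmetic on terms in an INSERT, so appending `+ 1000` to a function call is rejected by the parser; the usual workaround is to generate the value client-side and bind it as a parameter. A minimal sketch (Python; the helper function and the statement binding are illustrative, not from the original post):

```python
import random

def random_hex(length):
    # Build a random lowercase hex string of the requested length,
    # computed client-side instead of inside the CQL statement.
    return "".join(random.choice("0123456789abcdef") for _ in range(length))

value = random_hex(20)

# With a driver (e.g. the DataStax Python driver), the generated value would
# then be bound as a parameter of a plain INSERT, with no arithmetic in CQL:
insert_cql = 'INSERT INTO "KS"."CF" (key, column1, value) VALUES (%s, %s, %s)'
print(len(value))  # 20
```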



Re: Cassandra rpm for 3.10

2017-04-03 Thread Michael Shuler
On 04/03/2017 02:58 PM, mahesh rajamani wrote:
> Hi,
> 
> Can you please let me know where I can get the Cassandra 3.10 RPM? If it's
> not available, instructions for building it would be helpful.

Check out the 'cassandra-3.10' tag and follow the README instructions in
the redhat/ directory.

https://github.com/apache/cassandra/tree/cassandra-3.10/redhat

-- 
Kind regards,
Michael


Cassandra rpm for 3.10

2017-04-03 Thread mahesh rajamani
Hi,

Can you please let me know where I can get the Cassandra 3.10 RPM? If it's not
available, instructions for building it would be helpful.


-- 
Regards,
Mahesh Rajamani


Re: nodetool tablestats reporting local read count of 0, incorrectly

2017-04-03 Thread Voytek Jarnot
Continuing to grasp at straws...

Is it possible that indexing is modifying the read path such that the
tablestats/tablehistograms output is no longer trustworthy?  I notice more
realistic "local read count" numbers on tables which do not utilize SASI.

Would greatly appreciate any thoughts,
Thanks.

On Mon, Apr 3, 2017 at 9:56 AM, Voytek Jarnot 
wrote:

> Further info - tablehistograms reports zeros for all percentiles for Read
> Latency; tablestats also reports really low numbers for Bloom filter usage
> (3-4 KiB, depending on node, whereas I'd expect orders of magnitude more
> given other - less accessed - tables in this keyspace).  This is the most
> written-to and read-from table in the keyspace, seems to keep up with
> tracking of writes, but not reads.
>
> Full repair on this table is the only thing I can think of; but that's a
> guess and doesn't get me any closer to understanding what has happened.
>
> On Fri, Mar 31, 2017 at 11:11 PM, Voytek Jarnot 
> wrote:
>
>> Cassandra 3.9
>>
>> Have a keyspace with 5 tables, one of which is exhibiting rather poor
>> read performance. In starting an attempt to get to the bottom of the
>> issues, I noticed that, when running nodetool tablestats against the
>> keyspace, that particular table reports "Local read count: 0" on all nodes
>> - which is incorrect.
>>
>> It tallies "Local write count", presumably correctly, as at least it's
>> not 0. Other tables in the keyspace do not exhibit this behavior, as they
>> provide non-zero numbers for both read and write values.
>>
>> Is this perhaps indicative of a deeper issue with this particular table?
>>
>> Thank you.
>>
>
>


RE: Can we get username and timestamp in cqlsh_history?

2017-04-03 Thread Durity, Sean R
Sounds like you want full auditing of CQL in the cluster. I have not seen 
anything built into the open source version for that (but I could be missing 
something). DataStax Enterprise does have an auditing feature.


Sean Durity

From: anuja jain [mailto:anujaja...@gmail.com]
Sent: Wednesday, March 29, 2017 7:37 AM
To: user@cassandra.apache.org
Subject: Can we get username and timestamp in cqlsh_history?

Hi,
I have a Cassandra cluster with a lot of keyspaces and users. I want to get 
the history of CQL commands along with the username and the time at which each 
command was run.
Also, if we are running commands from GUI tools like DevCenter or DBeaver, can 
we log those commands too? If yes, how?

Thanks,
Anuja



The information in this Internet Email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this Email by 
anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be taken in 
reliance on it, is prohibited and may be unlawful. When addressed to our 
clients any opinions or advice contained in this Email are subject to the terms 
and conditions expressed in any applicable governing The Home Depot terms of 
business or client engagement letter. The Home Depot disclaims all 
responsibility and liability for the accuracy and content of this attachment 
and for any damages or losses arising from any inaccuracies, errors, viruses, 
e.g., worms, trojan horses, etc., or other items of a destructive nature, which 
may be contained in this attachment and shall not be liable for direct, 
indirect, consequential or special damages in connection with this e-mail 
message or its attachment.


Re: cassandra OOM

2017-04-03 Thread Alexander Dejanovski
Hi,

Could you share your GC settings? G1 or CMS? Heap size, etc.

Thanks,

On Sun, Apr 2, 2017 at 10:30 PM Gopal, Dhruva 
wrote:

> Hi –
>
>   We’ve had what looks like an OOM situation with Cassandra (we have a
> dump file that got generated) in our staging (performance/load testing)
> environment, and I wanted to reach out to this user group to see if you had
> any recommendations on how we should approach our investigation into the
> cause of this issue. The logs don’t seem to point to any obvious issues,
> and we’re not experts in analyzing this by any means, so I was looking for
> guidance on how to proceed. Should we enter a Jira as well? We’re on
> Cassandra 3.9 and are running a six-node cluster. This happened in a
> controlled load-testing environment. Feedback will be much appreciated!
>
>
>
>
>
> Regards,
>
> Dhruva
>
>
> This email (including any attachments) is proprietary to Aspect Software,
> Inc. and may contain information that is confidential. If you have received
> this message in error, please do not read, copy or forward this message.
> Please notify the sender immediately, delete it from your system and
> destroy any copies. You may not further disclose or distribute this email
> or its attachments.
>
-- 
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: nodetool tablestats reporting local read count of 0, incorrectly

2017-04-03 Thread Voytek Jarnot
Further info - tablehistograms reports zeros for all percentiles for Read
Latency; tablestats also reports really low numbers for Bloom filter usage
(3-4 KiB, depending on node, whereas I'd expect orders of magnitude more
given other - less accessed - tables in this keyspace).  This is the most
written-to and read-from table in the keyspace, seems to keep up with
tracking of writes, but not reads.

Full repair on this table is the only thing I can think of; but that's a
guess and doesn't get me any closer to understanding what has happened.

On Fri, Mar 31, 2017 at 11:11 PM, Voytek Jarnot 
wrote:

> Cassandra 3.9
>
> Have a keyspace with 5 tables, one of which is exhibiting rather poor read
> performance. In starting an attempt to get to the bottom of the issues, I
> noticed that, when running nodetool tablestats against the keyspace, that
> particular table reports "Local read count: 0" on all nodes - which is
> incorrect.
>
> It tallies "Local write count", presumably correctly, as at least it's not
> 0. Other tables in the keyspace do not exhibit this behavior, as they
> provide non-zero numbers for both read and write values.
>
> Is this perhaps indicative of a deeper issue with this particular table?
>
> Thank you.
>


Re: Can we get username and timestamp in cqlsh_history?

2017-04-03 Thread Nicolas Guyomar
Hi Anuja,

What your are looking for is provided as part of DSE :
https://docs.datastax.com/en/datastax_enterprise/5.0/datastax_enterprise/sec/auditEnabling.html

On 1 April 2017 at 20:15, Vladimir Yudovin  wrote:

> Hi anuja,
>
> I don't think there is a way to do this without creating a custom Cassandra
> build.
> There are mutation logs, and somewhere on the list there was a thread about
> parsing them, but I'm not sure that's what you need.
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
>  On Wed, 29 Mar 2017 07:37:15 -0400 *anuja jain  >* wrote 
>
> Hi,
> I have a Cassandra cluster with a lot of keyspaces and users. I want to
> get the history of CQL commands along with the username and the time at
> which each command was run.
> Also, if we are running commands from GUI tools like
> DevCenter or DBeaver, can we log those commands too? If yes, how?
>
> Thanks,
> Anuja
>
>
>