Re: Lot of GC on two nodes out of 7

2016-03-06 Thread Anishek Agarwal
@Jeff I was just trying to follow some of the other advice given above; I
personally still think a larger new gen heap size would be better.

@Jonathan I will post the whole logs. I have restarted the nodes with
additional changes; most probably tomorrow or the day after I will put out
the GC logs.

The problem still exists on two nodes: too much time is spent in GC.
Additionally, I tried to print the state of the cluster via my application
to see what is happening, and I see that the node with high GC has a lot of
"inflight queries" -- almost 1100, while the other nodes are all at 0.

The cfhistograms for all nodes show approximately the same number of reads,
so I am thinking the above phenomenon is happening because that node is
spending its time in GC.

Also, looking at the load balancing policy on the client, it is new
TokenAwarePolicy(new DCAwareRoundRobinPolicy()).
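
For reference, a minimal sketch of how that policy is typically wired into
the DataStax Java driver (the contact point and keyspace names here are
placeholders, not my actual setup):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
    import com.datastax.driver.core.policies.TokenAwarePolicy;

    public class ClientSetup {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")          // hypothetical contact point
                    .withLoadBalancingPolicy(
                            new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
                    .build();
            // hypothetical keyspace; token-aware routing sends each query to a
            // replica owning its partition key, with DC-aware round robin as fallback
            Session session = cluster.connect("my_keyspace");
            session.close();
            cluster.close();
        }
    }

so queries should already be spread across replicas rather than piling up on
a single coordinator.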

If you have any other ideas, please keep posting them.

thanks
anishek

On Sat, Mar 5, 2016 at 12:54 AM, Jonathan Haddad  wrote:

> Without looking at your GC logs (you never posted a gist), my assumption
> would be you're doing a lot of copying between survivor generations, and
> they're taking a long time.  You're probably also copying a lot of data to
> your old gen as a result of having full-ish survivor spaces to begin with.
>
> On Thu, Mar 3, 2016 at 10:26 PM Jeff Jirsa 
> wrote:
>
>> I personally would have gone the other way – if you’re seeing ParNew,
>> increasing new gen instead of decreasing it should help objects get
>> collected there (faster) rather than promoted to survivor/old gen (slower)?
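>>
>> (Purely as an illustration – the values are placeholders, not a tuning
>> recommendation – new gen is usually sized via cassandra-env.sh, roughly:)
>>
>>     MAX_HEAP_SIZE="8G"
>>     HEAP_NEWSIZE="2G"   # a larger new gen lets short-lived objects die in
>>                         # ParNew instead of being promoted to survivor/old gen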
>>
>>
>>
>> From: Anishek Agarwal
>> Reply-To: "user@cassandra.apache.org"
>> Date: Thursday, March 3, 2016 at 8:55 PM
>>
>> To: "user@cassandra.apache.org"
>> Subject: Re: Lot of GC on two nodes out of 7
>>
>> Hello,
>>
>> Bryan, most of the partition sizes are under 45 KB
>>
>> I have tried concurrent_compactors: 8 for one of the nodes; still no
>> improvement.
>> I have tried max_heap_size: 8G; no improvement.
>>
>> I will try a new gen size of 2G, though I am sure CMS will take longer
>> then.
>>
>> Also, it doesn't look like I mentioned what type of GC was causing the
>> problems. On both nodes it's ParNew GC that takes long for each run, and
>> too many runs are happening in succession.
>>
>> anishek
>>
>>
>> On Fri, Mar 4, 2016 at 5:36 AM, Bryan Cheng 
>> wrote:
>>
>>> Hi Anishek,
>>>
>>> In addition to the good advice others have given, do you notice any
>>> abnormally large partitions? What does cfhistograms report for 99%
>>> partition size? A few huge partitions will cause very disproportionate load
>>> on your cluster, including high GC.
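>>>
>>> (E.g., nodetool cfhistograms <keyspace> <table> on the hot nodes will show
>>> the partition size distribution, including the high percentiles.)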
>>>
>>> --Bryan
>>>
>>> On Wed, Mar 2, 2016 at 9:28 AM, Amit Singh F 
>>> wrote:
>>>
 Hi Anishek,



 We too faced a similar problem in 2.0.14, and after doing some research we
 changed a few parameters in cassandra.yaml and were able to overcome the GC
 pauses. Those are:



 · memtable_flush_writers : increased from 1 to 3, since from the
 tpstats output we can see dropped mutations, which means writes are getting
 blocked; increasing this number takes care of them.

 · memtable_total_space_in_mb : default is 1/4 of the heap size; it can
 be lowered, because large, long-lived objects create pressure on the heap,
 so it is better to reduce it somewhat.

 · concurrent_compactors : Alain rightly pointed this out, i.e.
 reduce it to 8. You need to try this (a combined sketch of these settings
 follows below).
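
 As a rough sketch (the exact values, in particular the
 memtable_total_space_in_mb figure, are illustrative rather than tuned
 recommendations), the relevant cassandra.yaml entries look like:

     memtable_flush_writers: 3
     memtable_total_space_in_mb: 1024     # lower than the 1/4-of-heap default
     concurrent_compactors: 8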



 Also, please check whether you have dropped mutations on other nodes or not.



 Hope this helps in your cluster too.



 Regards

 Amit Singh

 *From:* Jonathan Haddad [mailto:j...@jonhaddad.com]
 *Sent:* Wednesday, March 02, 2016 9:33 PM
 *To:* user@cassandra.apache.org
 *Subject:* Re: Lot of GC on two nodes out of 7



 Can you post a gist of the output of jstat -gccause (60 seconds
 worth)?  I think it's cool you're willing to experiment with alternative
 JVM settings but I've never seen anyone use max tenuring threshold of 50
 either and I can't imagine it's helpful.  Keep in mind if your objects are
 actually reaching that threshold it means they've been copied 50x (really
 really slow) and also you're going to end up spilling your eden objects
 directly into your old gen if your survivor is full.  Considering the small
 amount of memory you're using for heap I'm really not surprised you're
 running into problems.
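
 (Roughly 60 seconds' worth would be something like the following, run
 against the Cassandra process id:)

     jstat -gccause <cassandra-pid> 1000 60   # one sample per second, 60 samples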



 I recommend G1GC + 12GB heap and just let it optimize itself for almost
 all cases with the latest JVM versions.
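
 (As an illustrative sketch, that boils down to JVM options along these
 lines – where exactly they go depends on how your cassandra-env.sh is laid
 out:)

     -Xms12G -Xmx12G -XX:+UseG1GC
     # and remove the CMS/ParNew, survivor-ratio and tenuring-threshold flags
     # so G1 can tune region sizes and tenuring on its own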



 On Wed, Mar 2, 2016 at 6:08 AM Alain RODRIGUEZ 
 wrote:

 It looks like you are doing good work with this cluster and know a lot
 about the JVM; that's good :-).



 our machine 

Re: How to create an additional cluster in Cassandra exclusively for Analytics Purpose

2016-03-06 Thread Jack Krupansky
I don't have any direct personal experience with Stratio. It will all
depend on your queries and your data cardinality - some queries are fine
with secondary indexes while others are quite poor. Ditto for Lucene and
Solr.

It is also worth noting that the new SASI feature of Cassandra supports
keyword and prefix/suffix search. But it doesn't support multi-column ad
hoc queries, which is what people tend to use Lucene and Solr for. So,
again, it all depends on your queries and your data cardinality.
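
For example (keyspace, table and column names here are just placeholders),
a SASI prefix index and query look roughly like this:

    CREATE CUSTOM INDEX user_name_prefix_idx ON my_ks.users (name)
        USING 'org.apache.cassandra.index.sasi.SASIIndex'
        WITH OPTIONS = {'mode': 'PREFIX'};

    SELECT * FROM my_ks.users WHERE name LIKE 'jon%';

(Mode CONTAINS covers the suffix/substring case.)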

-- Jack Krupansky

On Sun, Mar 6, 2016 at 1:29 AM, Bhuvan Rawal  wrote:

> Yes Jack, we are rolling out with Stratio right now; we will assess the
> performance benefit it yields and can go for ElasticSearch/Solr later.
>
> As per your experience how does Stratio perform vis-a-vis Secondary
> Indexes?
>
> On Sun, Mar 6, 2016 at 11:15 AM, Jack Krupansky 
> wrote:
>
>> You haven't been clear about how you intend to add Solr. You can also use
>> Stratio or Stargate for basic Lucene search if you don't need full
>> Solr support and want to stick to open source rather than go with DSE
>> Search for Solr.
>>
>> -- Jack Krupansky
>>
>> On Sun, Mar 6, 2016 at 12:25 AM, Bhuvan Rawal 
>> wrote:
>>
>>> Thanks Sean and Nirmallaya.
>>>
>>> @Jack, we are going with DSC right now and plan to use Spark, and later
>>> Solr, over the analytics DC. The use case is to have OLAP and OLTP
>>> workloads separated and not intertwined, whether that is achieved by
>>> creating a new DC or a new cluster altogether. From Nirmallaya's and Sean's
>>> answers I understand that this is easily achievable by creating a separate
>>> DC: the app client will need to be made DC-aware so that it does not use a
>>> coordinator in DC3, and the same goes for the Spark configuration, which
>>> should read from the 3rd DC. Correct me if I'm wrong.
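>>>
>>> As a sketch of what I have in mind (the DC names and hosts below are
>>> placeholders), the Spark side would pin itself to the analytics DC through
>>> the connector settings (e.g. in spark-defaults.conf):
>>>
>>>     # nodes in the analytics DC (placeholder addresses)
>>>     spark.cassandra.connection.host      10.0.3.1,10.0.3.2
>>>     spark.cassandra.connection.local_dc  DC3
>>>
>>> while the application's driver keeps its load balancing policy pointed at
>>> the OLTP DC, e.g. new TokenAwarePolicy(
>>> DCAwareRoundRobinPolicy.builder().withLocalDc("DC1").build()).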
>>>
>>> On Mar 4, 2016 7:55 PM, "Jack Krupansky" 
>>> wrote:
>>> >
>>> > DataStax Enterprise (DSE) should be fine for three or even four data
>>> centers in the same cluster. Or are you talking about some custom Solr
>>> implementation?
>>> >
>>> > -- Jack Krupansky
>>> >
>>> > On Fri, Mar 4, 2016 at 9:21 AM,  wrote:
>>> >>
>>> >> Sure. Just add a new DC. Alter your keyspaces with a new replication
>>> factor for that DC. Run repairs on the new DC to get the data streamed.
>>> Then make sure your clients only connect to the DC(s) that they need.
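>>> >>
>>> >> As a rough sketch (keyspace and DC names here are placeholders), the
>>> >> keyspace change looks like:
>>> >>
>>> >>     ALTER KEYSPACE my_ks WITH replication =
>>> >>         {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC_ANALYTICS': 2};
>>> >>
>>> >> followed on each node in the new DC by the repair (or, commonly, a
>>> >> nodetool rebuild -- DC1) to stream the existing data over.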
>>> >>
>>> >>
>>> >>
>>> >> Separation of workloads is one of the key powers of a Cassandra
>>> cluster.
>>> >>
>>> >>
>>> >>
>>> >> You may want to look at different configurations for the analytics
>>> cluster – smaller replication factor, more memory per node, more disk per
>>> node, perhaps fewer vnodes. Others may chime in with their experience.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Sean Durity
>>> >>
>>> >>
>>> >>
>>> >> From: Bhuvan Rawal [mailto:bhu1ra...@gmail.com]
>>> >> Sent: Friday, March 04, 2016 3:27 AM
>>> >> To: user@cassandra.apache.org
>>> >> Subject: How to create an additional cluster in Cassandra exclusively
>>> for Analytics Purpose
>>> >>
>>> >>
>>> >>
>>> >> Hi,
>>> >>
>>> >>
>>> >>
>>> >> We would like to create an additional C* data center for batch
>>> processing using Spark on CFS. We would like to limit this DC exclusively
>>> to Spark operations and have the application servers continue fetching
>>> data from the OLTP DC.
>>> >>
>>> >>
>>> >>
>>> >> Is there any way to configure the same?
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Regards,
>>> >>
>>> >> Bhuvan
>>> >>
>>> >>
>>> >
>>> >
>>>
>>
>>
>


Re: Unexplainably large reported partition sizes

2016-03-06 Thread Tom van den Berge
No, data is hardly ever deleted from this table. The cfstats confirm this.
The data is also not reinserted.
On 5 Mar 2016 at 6:20 PM, "DuyHai Doan"  wrote:

> Maybe tombstones ? Do you issue a lot of DELETE statements ? Or do you
> re-insert in the same partition with different TTL values ?
>
> On Sat, Mar 5, 2016 at 7:16 PM, Tom van den Berge 
> wrote:
>
>> I don't think compression can be the cause of the difference, because of
>> two reasons:
>>
>> 1) The partition size I calculated myself (3 MB) is the uncompressed
>> size, and so is the reported size (2.3 GB)
>>
>> 2) The difference is simply way too big to be explained by compression,
>> even if the calculated size were the compressed size. The compression
>> ratio would then be 0.125% of the original, which is not realistic. In the
>> logs, I can see that the typical compression achieved for this table is
>> around 80% of the original.
>>
>> Tom
>>
>> On Fri, Mar 4, 2016 at 9:48 PM, Robert Coli  wrote:
>>
>>> On Fri, Mar 4, 2016 at 5:56 AM, Tom van den Berge 
>>> wrote:
>>>
  Compacting large partition
 drillster/subscriberstats:rqtPewK-1chi0JSO595u-Q (1,470,058,292 bytes)

 This means that this single partition is about 1.4 GB large. This is
 much larger than it can possibly be, for two reasons:
   1) the partition has appr. 50K rows, each roughly 62 bytes = ~3 MB
   2) the entire table consumes appr. 500 MB of disk space on the node
 containing the partition (including snapshots)

 Furthermore, nodetool cfstats tells me this:
 Space used (live): 253,928,111
 Space used (total): 253,928,111
 Compacted partition maximum bytes: 2,395,318,855
 The space used seems to match the actual size (excl. snapshots), but the
 Compacted partition maximum bytes (2.3 GB) seems to be far higher than
 possible. Does anyone know how it is possible that Cassandra reports such
 unlikely sizes?
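
 (For reference, those figures come straight from the node via

     nodetool cfstats drillster.subscriberstats

 and the partition-size distribution can be cross-checked with

     nodetool cfhistograms drillster subscriberstats

 on the node that holds this partition.)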

>>>
>>> Compression is enabled by default, and compaction reports the
>>> uncompressed size.
>>>
>>> =Rob
>>>
>>>
>>
>>
>>
>> --
>> Tom van den Berge
>> Lead Software Engineer
>>
>>
>> Middenburcht 136
>> 3452 MT Vleuten
>> Netherlands +31 30 755 53 30
>> www.drillster.com
>>
>>
>
>