Re:Re: Re: Cassandra Tuning Issue

2015-12-08 Thread xutom



Dear Jack,
Thank you very much! We now get much better performance when each batch 
contains only inserts for the same partition key.

jerry


At 2015-12-07 13:08:31, "Jack Krupansky"  wrote:

If you combine inserts for multiple partition keys in the same batch you negate 
most of the effect of token-aware routing. It's best to insert only rows with 
the same partition key in a single batch. You also need to set the partition 
key for routing for the batch.
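
A minimal sketch of the grouping step described above, assuming the rows to insert are already in memory. The Row and PartitionBatcher names are hypothetical and not from the original client. The driver call is only indicated in a comment, because how the routing key is set for a batch depends on the driver version. The sketch needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row type for illustration; not part of the original client.
final class Row {
    final int partitionKey;
    final String payload;

    Row(int partitionKey, String payload) {
        this.partitionKey = partitionKey;
        this.payload = payload;
    }
}

final class PartitionBatcher {
    // Groups rows so that each resulting list holds rows of exactly one partition key.
    static Map<Integer, List<Row>> groupByPartitionKey(List<Row> rows) {
        Map<Integer, List<Row>> groups = new LinkedHashMap<>();
        for (Row row : rows) {
            groups.computeIfAbsent(row.partitionKey, k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "a"));
        rows.add(new Row(2, "b"));
        rows.add(new Row(1, "c"));

        for (Map.Entry<Integer, List<Row>> group : groupByPartitionKey(rows).entrySet()) {
            // Assumption: with the DataStax driver, a fresh BatchStatement would be
            // created here, filled with one bound statement per row in
            // group.getValue(), and executed, so one batch never mixes partitions.
            System.out.println("partition " + group.getKey() + ": "
                    + group.getValue().size() + " row(s)");
        }
    }
}
```

Creating a new batch per group also avoids reusing one batch object across loop iterations, which would make every execution resend the statements added earlier.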


Also, RF=2 is not recommended since it does not permit quorum operations if a 
replica node is down. RF=3 is generally more appropriate.
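
The arithmetic behind the RF recommendation above can be checked with a short sketch. It uses the usual quorum formula, floor(RF / 2) + 1; the QuorumMath class name is made up for this example.

```java
final class QuorumMath {
    // Quorum size for a given replication factor: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Number of replicas that can be down while QUORUM operations still succeed.
    static int tolerableReplicaFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 5; rf++) {
            System.out.println("RF=" + rf
                    + " quorum=" + quorum(rf)
                    + " tolerated failures=" + tolerableReplicaFailures(rf));
        }
    }
}
```

With RF=2 the quorum is 2, so no replica may be down; with RF=3 the quorum is still 2, so one replica may be down.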


-- Jack Krupansky


On Sun, Dec 6, 2015 at 10:27 PM, xutom  wrote:

Dear all,
Thanks for your reply!
I am using Apache Cassandra 2.1.1 with JDK 1.7.0_79. My keyspace 
replication factor is 2, and I have enabled token-aware routing. The GC 
configuration is the default:
# GC tuning options
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
I checked the GC log (gc.log.0.current) and found only one full GC. The 
stop-the-world times are low:
CMS-initial-mark: 0.2747280 secs
CMS-remark: 0.3623090 secs

The insert code in my test client is as follows:
    String content = RandomStringUtils.randomAlphabetic(120);
    cluster = Cluster
            .builder()
            .addContactPoint(this.seedIP)
            .withCredentials("test", "test")
            .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
            .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
            .build();
    session = cluster.connect("demo");
    ..

    PreparedStatement insertPreparedStatement = session.prepare(
            "INSERT INTO teacher (id, lastname, firstname, city) " +
            "VALUES (?, ?, ?, ?);");

    BatchStatement batch = new BatchStatement();
    for (; i < max; i += 5) {
        try {
            batch.add(insertPreparedStatement.bind(i, "Entre Nous", "adsfasdfa1", content));
            batch.add(insertPreparedStatement.bind(i+1, "Entre Nous", "adsfasdfa2", content));
            batch.add(insertPreparedStatement.bind(i+2, "Entre Nous", "adsfasdfa3", content));
            batch.add(insertPreparedStatement.bind(i+3, "Entre Nous", "adsfasdfa4", content));
            batch.add(insertPreparedStatement.bind(i+4, "Entre Nous", "adsfasdfa5", content));

            //System.out.println("the is is " + i);
            session.execute(batch);

            thisTimeCount += 5;
        }
    }




Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Jack Krupansky
Great! Make sure to inform the C* email list as well so that others know.

-- Jack Krupansky


Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Anuj Wadehra
Hi Jerry,


It's great that you got a performance improvement. Moreover, I agree with what 
Graham said. I think you are using an extremely large heap with CMS, and in a 
very odd ratio: 40G for the new gen, leaving only 20G for the old gen, seems 
unreasonable. It's hard to believe that you are seeing reasonable GC pauses; 
please recheck. I would suggest testing your performance with a much smaller 
heap, maybe a 16G max heap and a 4G new gen. Also, make sure that you apply all 
the recommended production settings suggested by DataStax at 
http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html


Don't worry about wasting your memory; it will be used for OS caching, and you 
can get even better performance.


Thanks

Anuj


Re:Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread xutom
Hi Anuj,
Thanks! I will retry now!
By the way, how do I "inform the C* email list as well so that others know", as 
Jack said? I am sorry I have not done that yet.

Thanks
jerry



Re: Re:Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Anuj Wadehra
You just need to send mail to user@cassandra.apache.org. Everyone on the 
mailing list, including you, will get the mail.


Anuj


Re: Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Jack Krupansky
Sorry, Jerry, my mistake - I mistakenly thought you had emailed me
directly! You had already informed the email list.

-- Jack Krupansky


Re: Cassandra Tuning Issue

2015-12-06 Thread Jack Krupansky
What replication factor are you using? Even if your writes use CL.ONE,
Cassandra will be attempting writes to the replica nodes in the background.

Are your writes "token aware"? If not, the receiving node has the overhead
of forwarding the request to the node that owns the token for the primary
key.

For the record, Cassandra is not designed and optimized for so-called "fat
nodes". The design focus is "commodity hardware" and "distributed cluster"
(typically a dozen or more nodes.)

That said, it would be good if we had a rule of thumb for how many
simultaneous requests a node can handle, both external requests and
inter-node traffic. I think there is an open Jira to enforce a limit on
in-flight requests so that nodes don't get overloaded and start failing in the
middle of writes, as you seem to be seeing.
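
Until such a server-side limit exists, a similar cap can be approximated on the client. The following is only a sketch of that idea, not the mechanism from the Jira ticket: InFlightLimiter is a hypothetical helper, and the request is modeled as a generic CompletableFuture rather than a real driver call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical client-side helper that caps the number of outstanding requests.
final class InFlightLimiter {
    private final Semaphore permits;

    InFlightLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Waits until a permit is free, starts the request, and returns the permit
    // when the request completes (successfully or with an error).
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> request) {
        permits.acquireUninterruptibly();
        try {
            return request.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            // The request never started, so give the permit back immediately.
            permits.release();
            throw e;
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With a limiter shared by all client threads, adding more threads no longer increases the number of writes the cluster sees at once; the extra threads simply wait for a permit.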

-- Jack Krupansky



Cassandra Tuning Issue

2015-12-06 Thread jerry
Dear All,

I have a 4-node Cassandra cluster, and I want to find its maximum 
performance. I wrote a Java client that batch-inserts data into all 4 nodes. 
When I start fewer than 30 threads in my client application, everything is 
fine, but when I start more than 80 or 100 threads, there are too many timeout 
exceptions (such as: Cassandra timeout during write query at consistency ONE 
(1 replica were required but only 0 acknowledged the write)). No matter how 
many threads I use, or even if I start multiple clients with multiple threads 
on different computers, the highest throughput I can get is about 6 - 8 TPS. 
By the way, each row I insert into Cassandra is about 130 bytes.
My 4 Cassandra nodes have:
CPU: 4*15 
Memory: 512G
Disk: flash card (only one disk but better than SSD)
My Cassandra configuration is:
MAX_HEAP_SIZE: 60G
NEW_HEAP_SIZE: 40G

When I insert data into my Cassandra cluster, none of the nodes reaches a 
bottleneck in CPU, memory, or disk. All three of these resources are idle. So 
I think maybe there is something wrong with the configuration of my Cassandra 
cluster. Can somebody please help me tune it? Thanks in advance!


Re: Cassandra Tuning Issue

2015-12-06 Thread Graham Sanderson
What version of C* are you using, and what JVM version? You showed a partial GC 
config, but if that is still CMS (not G1) then you are going to have insane GC 
pauses... 

Depending on the C* version, are you using on-heap or off-heap memtables, and 
what type?

Those are the sorts of issues related to fat nodes that I'd be worried about. 
We run very nicely at 20G total heap and 8G new; the rest of our 128G memory is 
disk cache/mmap and all of the off-heap stuff, so it doesn't go to waste.

That said, I think Jack is probably on the right path with overloaded 
coordinators, though you'd still expect to see CPU usage unless your timeouts 
are too low for the load. In that case the coordinator would be getting no 
responses in time, and quite possibly the other nodes are just dropping the 
mutations (since they don't get to them before they know the coordinator would 
have timed out). I forget the command to check dropped mutations off the top 
of my head, but you can see it in OpsCenter.

If you have GC problems you would certainly expect to see GC CPU usage, but 
depending on how long you run your tests it might take you a little while to 
run through 40G.

I'm personally not a fan of >32G (ish) heaps, as you can't use compressed oops, 
and it is also unrealistic for CMS... The word is that G1 is now working OK 
with C*, especially on newer C* and JDK versions, but that said, it takes quite 
a lot of throughput to require insane quantities of young gen... We are 
guessing that when we remove all our legacy Thrift batch inserts we will need 
less, and as for 20G total, we actually don't need that much (we dropped from 
24 when we moved memtables off heap, and believe we can drop further).
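
The 32G remark above can be turned into a quick check. This is a rough sketch that assumes HotSpot's default 8-byte object alignment, under which compressed oops are only available for heaps below roughly 32 GiB; the HeapSizeCheck class and its threshold constant are illustrative, not taken from any Cassandra tooling.

```java
final class HeapSizeCheck {
    // Assumed threshold: with default 8-byte object alignment, HotSpot can only
    // use compressed ordinary object pointers for heaps below roughly 32 GiB.
    static final long COMPRESSED_OOPS_LIMIT_BYTES = 32L * 1024 * 1024 * 1024;

    static boolean fitsCompressedOops(long maxHeapBytes) {
        return maxHeapBytes < COMPRESSED_OOPS_LIMIT_BYTES;
    }

    // Converts a size in GiB to bytes.
    static long gib(long n) {
        return n * 1024 * 1024 * 1024;
    }

    public static void main(String[] args) {
        // Reports on the JVM this program runs in, not on a Cassandra node.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("max heap bytes: " + maxHeap
                + ", below compressed-oops limit: " + fitsCompressedOops(maxHeap));
    }
}
```

By this rule of thumb, the 20G heap mentioned above stays under the limit, while the 60G heap from the original question does not.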

Sent from my iPhone

> On Dec 6, 2015, at 9:07 AM, Jack Krupansky  wrote:
> 
> What replication factor are you using? Even if your writes use CL.ONE, 
> Cassandra will be attempting writes to the replica nodes in the background.
> 
> Are your writes "token aware"? If not, the receiving node has the overhead of 
> forwarding the request to the node that owns the token for the primary key.
> 
> For the record, Cassandra is not designed and optimized for so-called "fat 
> nodes". The design focus is "commodity hardware" and "distributed cluster" 
> (typically a dozen or more nodes.)
> 
> That said, it would be good if we had a rule of thumb for how many 
> simultaneous requests a node can handle, both external requests and 
> inter-node traffic. I think there is an open Jira to enforce a limit on 
> in-flight requests so that nodes don't get overloaded and start failing in the 
> middle of writes as you seem to be seeing.
> 
> -- Jack Krupansky
> 
>> On Sun, Dec 6, 2015 at 9:29 AM, jerry  wrote:
>> Dear All,
>> 
>> I have a 4-node Cassandra cluster, and I want to find its maximum 
>> performance. I wrote a Java client that batch-inserts data into all 4 
>> Cassandra nodes. When I start fewer than 30 subthreads in my client 
>> application to insert data into Cassandra, everything is fine, but when I 
>> start more than 80 or 100 subthreads, there are too many timeout exceptions 
>> (such as: Cassandra timeout during write query at consistency ONE (1 
>> replica were required but only 0 acknowledged the write)). No matter how 
>> many subthreads I use, and even if I start multiple clients with multiple 
>> subthreads on different computers, the highest performance I can get is 
>> about 6 - 8 TPS. By the way, each row I insert into Cassandra is about 
>> 130 bytes.
>> My 4 Cassandra nodes are:
>> CPU: 4*15
>> Memory: 512G
>> Disk: flash card (only one disk but better than SSD)
>> My cassandra configurations are:
>> MAX_HEAP_SIZE: 60G
>> NEW_HEAP_SIZE: 40G
>> 
>> When I insert data into my Cassandra cluster, none of the nodes reaches a 
>> bottleneck in CPU, memory, or disk. All three main hardware resources are 
>> idle. So I think there may be something wrong with my Cassandra cluster 
>> configuration. Can somebody please help me tune Cassandra? Thanks in 
>> advance!
> 


Re:Re: Cassandra Tuning Issue

2015-12-06 Thread xutom
Dear all,
Thanks for your replies!
I'm using Apache Cassandra 2.1.1 with JDK 1.7.0_79. My keyspace replication
factor is 2, and I have enabled token-aware routing. The GC configuration is
the default:
# GC tuning options
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
I checked the GC log (gc.log.0.current) and found only one Full GC. The
stop-the-world times are low:
CMS-initial-mark: 0.2747280 secs
CMS-remark: 0.3623090 secs

The insert code in my test client is as follows:
String content = RandomStringUtils.randomAlphabetic(120);
cluster = Cluster
        .builder()
        .addContactPoint(this.seedIP)
        .withCredentials("test", "test")
        .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
        .withLoadBalancingPolicy(
                new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
        .build();
session = cluster.connect("demo");
..

PreparedStatement insertPreparedStatement = session.prepare(
        "INSERT INTO teacher (id, lastname, firstname, city) " +
        "VALUES (?, ?, ?, ?);");

for (; i < max; i += 5) {
    // Use a fresh batch per iteration: a reused BatchStatement keeps the
    // statements added earlier and would send them again on every execute.
    BatchStatement batch = new BatchStatement();
    try {
        // Five rows with five different ids go into each batch.
        batch.add(insertPreparedStatement.bind(i, "Entre Nous", "adsfasdfa1", content));
        batch.add(insertPreparedStatement.bind(i + 1, "Entre Nous", "adsfasdfa2", content));
        batch.add(insertPreparedStatement.bind(i + 2, "Entre Nous", "adsfasdfa3", content));
        batch.add(insertPreparedStatement.bind(i + 3, "Entre Nous", "adsfasdfa4", content));
        batch.add(insertPreparedStatement.bind(i + 4, "Entre Nous", "adsfasdfa5", content));

        //System.out.println("the is is " + i);
        session.execute(batch);

        thisTimeCount += 5;
    } catch (Exception e) {
        // The catch block was not included in the post; this is a placeholder.
        e.printStackTrace();
    }
}






Re: Re: Cassandra Tuning Issue

2015-12-06 Thread Jack Krupansky
If you combine inserts for multiple partition keys in the same batch you
negate most of the effect of token-aware routing. It's best to insert only
rows with the same partition key in a single batch. You also need to set
the partition key for routing for the batch.
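
A rough sketch of the grouping step described above, written in plain Java
without the driver so that it runs on its own. The Row class, its partitionKey
field, and the PartitionBatcher name are hypothetical stand-ins for
illustration; they are not part of the schema or client code in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionBatcher {

    /** Hypothetical row: partitionKey stands in for the partition key column(s). */
    static final class Row {
        final String partitionKey;
        final String payload;

        Row(String partitionKey, String payload) {
            this.partitionKey = partitionKey;
            this.payload = payload;
        }
    }

    /** Groups rows so that every list holds rows of exactly one partition key. */
    static Map<String, List<Row>> groupByPartitionKey(List<Row> rows) {
        Map<String, List<Row>> groups = new LinkedHashMap<String, List<Row>>();
        for (Row row : rows) {
            List<Row> group = groups.get(row.partitionKey);
            if (group == null) {
                group = new ArrayList<Row>();
                groups.put(row.partitionKey, group);
            }
            group.add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<Row>();
        rows.add(new Row("pk-1", "row a"));
        rows.add(new Row("pk-2", "row b"));
        rows.add(new Row("pk-1", "row c"));

        // Each group would go into its own batch and be executed separately,
        // so that a whole batch targets a single partition.
        for (Map.Entry<String, List<Row>> entry : groupByPartitionKey(rows).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().size() + " row(s)");
        }
    }
}
```

For the teacher table quoted in this thread, if id is the whole primary key
then every row is its own partition, each group holds a single row, and
separate inserts would do the same job as batches.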

Also, RF=2 is not recommended since it does not permit quorum operations if
a replica node is down. RF=3 is generally more appropriate.
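
The arithmetic behind the RF=2 remark can be sketched as follows. A quorum is
floor(RF / 2) + 1 replicas; the QuorumMath class and its method names are made
up for this illustration.

```java
class QuorumMath {

    /** Quorum size for a replication factor: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** Number of replicas that can be down while a quorum is still reachable. */
    static int tolerableDownReplicas(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 3; rf++) {
            System.out.println("RF=" + rf
                    + " quorum=" + quorum(rf)
                    + " tolerable down replicas=" + tolerableDownReplicas(rf));
        }
    }
}
```

With RF=2 the quorum is 2, so no replica may be down; with RF=3 the quorum is
still 2, which leaves room for one replica to be down.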

-- Jack Krupansky

On Sun, Dec 6, 2015 at 10:27 PM, xutom  wrote:

> Dear all,
> Thanks for ur reply!
> Now I`m using Apache Cassandra 2.1.1 and my JDK is 1.7.0_79,  my
> keyspace replication factor is 2,and I do enable the "token aware". The GC
> configuration is default for such as:
> # GC tuning options
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
> And I check the gc log: gc.log.0.current, I found there is only one
> Full GC. The stop-the-world times is low.
> CMS-initial-mark: 0.2747280 secs
> CMS-remark: 0.3623090 secs
>
> The insert codes in my test client are following:
> String content = RandomStringUtils.randomAlphabetic(120);
> cluster = Cluster
> .builder()
> .addContactPoint(this.seedIP)
> .withCredentials("test", "test")
> .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
> .withLoadBalancingPolicy(new TokenAwarePolicy(new
> DCAwareRoundRobinPolicy()))
> .build();
> session = cluster.connect("demo");
> ..
> PreparedStatement insertPreparedStatement = session.prepare(
> "   INSERT INTO teacher (id, lastname, firstname,
> city) " +
> "VALUES (?, ?, ?, ?); ");
>
> BatchStatement batch = new BatchStatement();
> for (; i < max; i+=5) {
> try {
> batch.add(insertPreparedStatement.bind(i, "Entre
> Nous", "adsfasdfa1", content));
> batch.add(insertPreparedStatement.bind(i+1, "Entre
> Nous", "adsfasdfa2", content));
> batch.add(insertPreparedStatement.bind(i+2, "Entre
> Nous", "adsfasdfa3", content));
> batch.add(insertPreparedStatement.bind(i+3, "Entre
> Nous", "adsfasdfa4", content));
> batch.add(insertPreparedStatement.bind(i+4, "Entre
> Nous", "adsfasdfa5", content));
>
> //System.out.println("the is is " + i);
> session.execute(batch);
> thisTimeCount += 5;
> }
> }
>
>
>
> At 2015-12-07 00:40:06, "Graham Sanderson"  wrote:
>
> What version of C* are you using; what JVM version - you showed a partial
> GC config but if that is still CMS (not G1) then you are going to have
> insane GC pauses...
>
> Depending on C* versions are you using on/off heap memtables and what type
>
> Those are the sorts of issues related to fat nodes; I'd be worried about -
> we run very nicely at 20G total heap and 8G new - the rest of our 128G
> memory is disk cache/mmap and all of the off heap stuff so it doesn't go to
> waste
>
> That said I think Jack is probably on the right path with overloaded
> coordinators- though you'd still expect to see CPU usage unless your
> timeouts are too low for the load, In which case the coordinator would be
> getting no responses in time and quite possibly the other nodes are just
> dropping the mutations (since they don't get to them before they know the
> coordinator would have timed out) - I forget the command to check dropped
> mutations off the top of my head but you can see it in opcenter
>
> If you have GC problems you certainly
> Expect to see GC cpu usage but depending on how long you run your tests it
> might take you a little while to run thru 40G
>
> I'm personally not a fan off >32G (ish) heaps as you can't do compressed
> oops and also it is unrealistic for CMS ... The word is that G1 is now
> working ok with C* especially on newer C* and JDK versions, but that said
> it takes quite a lot of thru-put to require insane quantities of young
> gen... We are guessing that when we remove all our legacy thrift batch
> inserts we will need less - and as for 20G total we actually don't need
> that much (we dropped from 24 when we moved memtables off heap, and believe
> we can drop further)
>
> Sent from my iPhone
>
> On Dec 6, 2015, at 9:07 AM, Jack Krupansky 
> wrote:
>
> What replication factor are you using? Even if your writes use CL.ONE,
> Cassandra will be attempting writes to the replica nodes in the background.
>
> Are your writes "token aware"? If not, the receiving node has the overhead
> of forwarding the request to the node that owns the token for the primary
> key.
>
> For the record, Cassandra is not designed and optimized for so-called "fat
>