Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread Tyler Hobbs
>
> the program uses DataStax driver 2.1.8 and 5 threads to insert data into
> cassandra on the same machine


The client with five threads is probably your bottleneck.  Try running the
cassandra-stress tool for comparison.  You should see at least double the
throughput.
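For reference, the stress tool ships with Cassandra; a minimal write run might look like this (the tools/bin path, row count, and thread count are assumptions to adjust for your install):

```shell
# Write 1,000,000 rows against a local node with 100 client threads,
# then compare the reported op rate with your own client's numbers.
tools/bin/cassandra-stress write n=1000000 -rate threads=100 -node 127.0.0.1
```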

On Thu, Nov 5, 2015 at 9:56 AM, Eric Stevens  wrote:

> > 512G memory , 128core cpu
>
> This seems dramatically oversized for a Cassandra node.  You'd do *much* 
> better
> to have a much larger cluster of much smaller nodes.
>
>
> On Thu, Nov 5, 2015 at 8:25 AM Jack Krupansky 
> wrote:
>
>> I don't know what current numbers are, but last year the idea of getting
>> 1 million writes per second on a 96 node cluster was considered a
>> reasonable achievement. That would be roughly 10,000 writes per second per
>> node and you are getting twice that.
>>
>> See:
>> http://www.datastax.com/1-million-writes
>>
>> Or this Google test which hit 1 million writes per second with 330 nodes,
>> which would be roughly 3,000 writes per second per node:
>>
>> http://googlecloudplatform.blogspot.com/2014/03/cassandra-hits-one-million-writes-per-second-on-google-compute-engine.html
>>
>> So, is your question why your throughput is so good or are you
>> disappointed that it wasn't better?
>>
>> Cassandra is designed for clusters with lots of nodes, so if you want to
>> get an accurate measure of per-node performance you need to test with a
>> reasonable number of nodes and then divide aggregate performance by the
>> number of nodes, not test a single node alone. In short, testing a single
>> node in isolation is not a recommended approach to testing Cassandra
>> performance.
>>
>>
>> -- Jack Krupansky
>>
>> On Thu, Nov 5, 2015 at 9:05 AM, 郝加来  wrote:
>>
>>> Hi everyone,
>>> I set up Cassandra 2.2.3 on a single node. The machine's environment is
>>> OpenJDK 1.8.0, 512 GB memory, a 128-core CPU, and a 3 TB SSD.
>>> The node uses 256 tokens. The client program uses DataStax driver 2.1.8
>>> with 5 threads to insert data into Cassandra on the same machine; the
>>> data is 6 GB, about 1,157,000 rows.
>>>
>>> Why is the throughput only 20,000/s on the node?
>>>
>>>
>>> # Per-thread stack size.
>>>
>>> JVM_OPTS="$JVM_OPTS -Xss512k"
>>>
>>>
>>>
>>> # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
>>>
>>>
>>>
>>> # GC tuning options
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalMode"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+DisableExplicitGC"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+CMSConcurrentMTEnabled"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
>>>
>>> JVM_OPTS="$JVM_OPTS
>>> -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler"
>>>
>>> JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=10000"
>>>
>>>
>>>
>>> memtable_heap_space_in_mb: 1024
>>>
>>> memtable_offheap_space_in_mb: 10240
>>>
>>> memtable_cleanup_threshold: 0.55
>>>
>>> memtable_allocation_type: heap_buffers
>>>
>>>
>>>
>>>
>>> That's all, thanks.
>>> --
>>>
>>> *郝加来*
>>>
>>> Finance East China Business Unit
>>>
>>> Neusoft Corporation
>>> Neusoft Software Park, 1000 Ziyue Road, Minhang District, Shanghai
>>> Postcode: 200241
>>> Tel: (86 21) 33578591
>>> Fax: (86 21) *23025565-111*
>>> Mobile: 13764970711
>>> Email: ha...@neusoft.com
>>> Http://www.neusoft.com
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> ---
>>> Confidentiality Notice: The information contained in this e-mail and any
>>> accompanying attachment(s)
>>> is intended only for the use of the intended recipient and may be
>>> confidential and/or privileged of
>>> Neusoft Corporation, its subsidiaries and/or its affiliates. If any
>>> reader of this communication is
>>> not the intended recipient, unauthorized use, forwarding, printing,
>>> storing, disclosure or copying
>>> is strictly prohibited, and may be unlawful. If you have received this
>>> communication in error, please
>>> immediately notify the sender by return e-mail, and delete the original
>>> message and all copies from
>>> your system. Thank you.
>>>
>>> ---
>>>
>>
>>


-- 
Tyler Hobbs
DataStax 


Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread Eric Stevens
> 512G memory , 128core cpu

This seems dramatically oversized for a Cassandra node.  You'd do *much* better
to have a much larger cluster of much smaller nodes.




Re: Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread 郝加来
hi,
Are writes to the same partition key on a single node atomic and isolated?
Sorry, I haven't read the source code, but I think Cassandra is
single-threaded per keyspace, not per partition key, and that the same
keyspace is atomic and isolated. I think so because when the client inserts
data into tables a and b, the throughput on table a drops, and the sum for
a + b is 20,000/s.




郝加来


Re: Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread Venkatesh Arivazhagan
I agree with Tyler! Have you tried increasing the client threads from 5 to a
higher number?
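A hypothetical sketch of that suggestion: more worker threads, plus a semaphore to cap how many writes are in flight at once. The doInsert() stub stands in for the real driver call (e.g. session.executeAsync on a bound statement); the class name, thread counts, and row count are illustrative, not from this thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class BoundedInserts {
    static final AtomicLong completed = new AtomicLong();

    // Placeholder for the real write, e.g. a DataStax driver
    // session.executeAsync(boundStatement) call.
    static void doInsert(long row) {
        completed.incrementAndGet();
    }

    public static void main(String[] args) throws Exception {
        int workers = 64;        // the original test used only 5 threads
        int maxInFlight = 256;   // back-pressure: cap concurrent requests
        long rows = 100_000;

        Semaphore inFlight = new Semaphore(maxInFlight);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (long i = 0; i < rows; i++) {
            inFlight.acquire();  // block while too many inserts are pending
            long row = i;
            pool.submit(() -> {
                try {
                    doInsert(row);
                } finally {
                    inFlight.release();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("completed=" + completed.get());
    }
}
```

The semaphore matters as much as the thread count: unbounded async writes will overload a single node, while a fixed cap lets you raise concurrency until throughput stops improving.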

Re: Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread 郝加来
> Cassandra is designed for clusters with lots of nodes
Right, I know that, but a single node's throughput is only 20,000/s? And the
total throughput across all tables is also 20,000/s? So I think a single
thread handles the commands for all tables. Normally, a database's total
throughput across all its tables is above 200,000/s.





郝加来



Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread Graham Sanderson
Agreed too. It also matters what you are inserting… if you are inserting into
the same (or a small set of) partition key(s), you will be limited, because
writes to the same partition key on a single node are atomic and isolated.


Re: Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread 郝加来
Right,
but we want a node's throughput to be above a million, so if the system has
fifty tables, a single table only needs to achieve 20,000/s.





郝加来



Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread Graham Sanderson
Also, it sounds like you are reading the data from a single file; the problem
could easily be with your load tool.

Try (as someone suggested) using cassandra-stress.

> On Nov 5, 2015, at 9:06 PM, Graham Sanderson <gra...@vast.com> wrote:
> 
> Agreed too. It also matters what you are inserting… if you are inserting to 
> the same (or small set of) partition key(s) you will be limited because 
> writes to the same partition key on a single node are atomic and isolated.
> 
>> On Nov 5, 2015, at 8:49 PM, Venkatesh Arivazhagan <venkey.a...@gmail.com 
>> <mailto:venkey.a...@gmail.com>> wrote:
>> 
>> I agree with Tyler! Have you tries increasing the the client threads from 5 
>> to a higher number?
>> 
>> On Nov 5, 2015 6:46 PM, "郝加来" <ha...@neusoft.com <mailto:ha...@neusoft.com>> 
>> wrote:
>> right ,
>> but wo want a node 's throught is above million , so if the system hava 
>> fifty table , a single table can achive 2/s .
>>  
>>  
>> 郝加来
>>  
>> From: Eric Stevens <mailto:migh...@gmail.com>
>> Date: 2015-11-05 23:56
>> To: user@cassandra.apache.org <mailto:user@cassandra.apache.org>
>> Subject: Re: why cassanra max is 2/s on a node ?
>> > 512G memory , 128core cpu
>> 
>> This seems dramatically oversized for a Cassandra node.  You'd do much 
>> better to have a much larger cluster of much smaller nodes.
>> 
>> 
>> On Thu, Nov 5, 2015 at 8:25 AM Jack Krupansky <jack.krupan...@gmail.com 
>> <mailto:jack.krupan...@gmail.com>> wrote:
>> I don't know what current numbers are, but last year the idea of getting 1 
>> million writes per second on a 96 node cluster was considered a reasonable 
>> achievement. That would be roughly 10,000 writes per second per node and you 
>> are getting twice that.
>> 
>> See:
>> http://www.datastax.com/1-million-writes 
>> <http://www.datastax.com/1-million-writes>
>> 
>> Or this Google test which hit 1 million writes per second with 330 nodes, 
>> which would be roughly 3,000 writes per second per node:
>> http://googlecloudplatform.blogspot.com/2014/03/cassandra-hits-one-million-writes-per-second-on-google-compute-engine.html
>> 
>> So, is your question why your throughput is so good or are you disappointed 
>> that it wasn't better?
>> 
>> Cassandra is designed for clusters with lots of nodes, so if you want to get 
>> an accurate measure of per-node performance you need to test with a 
>> reasonable number of nodes and then divide aggregate performance by the 
>> number of nodes, not test a single node alone. In short, testing a single 
>> node in isolation is not a recommended approach to testing Cassandra 
>> performance.
>> 
>> 
>> -- Jack Krupansky
>> 
>> On Thu, Nov 5, 2015 at 9:05 AM, 郝加来 <ha...@neusoft.com> wrote:
>> hi everyone
>> i set up cassandra 2.2.3 on a single node; the machine's environment is
>> openjdk-1.8.0, 512G memory, 128-core cpu, 3T ssd.
>> the token num is 256 on the node, and the program uses datastax driver 2.1.8
>> with 5 threads to insert data into cassandra on the same machine; the data's
>> capacity is 6G and 1157000 lines.
>>
>> why is the throughput 20000/s on the node ?
>>  
>> # Per-thread stack size.
>> JVM_OPTS="$JVM_OPTS -Xss512k"
>>  
>> # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
>> JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
>>  
>> # GC tuning options
>> JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalMode"
>> JVM_OPTS="$JVM_OPTS -XX:+DisableExplicitGC"
>> JVM_OPTS="$JVM_OPTS -XX:+CMSConcurrentMTEnabled"
>> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
>> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
>> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4" 
>> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
>> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
>> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
>> JVM_OPTS="$JVM_OPTS -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler"
>> JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=60000"
>>  
>> memtable_heap_space_in_mb: 1024
>
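The thread-count suggestion quoted above can be tried directly with the stress tool's rate option; a sketch, assuming cassandra-stress is on PATH and a node at 127.0.0.1 (the -rate threads=N flag exists in the 2.1/2.2 stress tool, though syntax differs by version):

```shell
# Sweep client thread counts to see where write throughput stops scaling.
for t in 5 50 200; do
  echo "=== $t client threads ==="
  cassandra-stress write n=200000 -rate threads=$t -node 127.0.0.1
done
```

If throughput keeps climbing well past 5 threads, the original five-thread loader was the limit, not the node.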

Re: why cassandra max is 20000/s on a node ?

2015-11-05 Thread Jack Krupansky
I don't know what current numbers are, but last year the idea of getting 1
million writes per second on a 96 node cluster was considered a reasonable
achievement. That would be roughly 10,000 writes per second per node and
you are getting twice that.

See:
http://www.datastax.com/1-million-writes

Or this Google test which hit 1 million writes per second with 330 nodes,
which would be roughly 3,000 writes per second per node:
http://googlecloudplatform.blogspot.com/2014/03/cassandra-hits-one-million-writes-per-second-on-google-compute-engine.html
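The per-node arithmetic behind those two figures, as shell integer arithmetic:

```shell
# Per-node write rates implied by the two benchmarks cited above.
per_node_datastax=$((1000000 / 96))    # ~10416 writes/s per node (96-node cluster)
per_node_google=$((1000000 / 330))     # ~3030 writes/s per node (330-node cluster)
echo "$per_node_datastax $per_node_google"
```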

So, is your question why your throughput is so good or are you disappointed
that it wasn't better?

Cassandra is designed for clusters with lots of nodes, so if you want to
get an accurate measure of per-node performance you need to test with a
reasonable number of nodes and then divide aggregate performance by the
number of nodes, not test a single node alone. In short, testing a single
node in isolation is not a recommended approach to testing Cassandra
performance.


-- Jack Krupansky

On Thu, Nov 5, 2015 at 9:05 AM, 郝加来  wrote:

> hi everyone
> i set up cassandra 2.2.3 on a single node; the machine's environment is
> openjdk-1.8.0, 512G memory, 128-core cpu, 3T ssd.
> the token num is 256 on the node, and the program uses datastax driver 2.1.8
> with 5 threads to insert data into cassandra on the same machine; the data's
> capacity is 6G and 1157000 lines.
>
> why is the throughput 20000/s on the node ?
>
>
> # Per-thread stack size.
>
> JVM_OPTS="$JVM_OPTS -Xss512k"
>
>
>
> # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
>
> JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
>
>
>
> # GC tuning options
>
> JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalMode"
>
> JVM_OPTS="$JVM_OPTS -XX:+DisableExplicitGC"
>
> JVM_OPTS="$JVM_OPTS -XX:+CMSConcurrentMTEnabled"
>
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
>
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
>
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
>
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
>
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
>
> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>
> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
>
> JVM_OPTS="$JVM_OPTS
> -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler"
>
> JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=60000"
>
>
>
> memtable_heap_space_in_mb: 1024
>
> memtable_offheap_space_in_mb: 10240
>
> memtable_cleanup_threshold: 0.55
>
> memtable_allocation_type: heap_buffers
>
>
>
>
> That's all,
> thanks
> --
>
> *郝加来*
>
> Finance East China Business Unit
>
> Neusoft Corporation
> Neusoft Software Park, 1000 Ziyue Road, Minhang District, Shanghai
> Postcode:200241
> Tel:(86 21) 33578591
> Fax:(86 21) *23025565-111*
> Mobile:13764970711
> Email:ha...@neusoft.com
> Http://www.neusoft.com 
>
>
>
>
>
>
>
>


why cassandra max is 20000/s on a node ?

2015-11-05 Thread 郝加来
hi everyone
i set up cassandra 2.2.3 on a single node; the machine's environment is
openjdk-1.8.0, 512G memory, 128-core cpu, 3T ssd.
the token num is 256 on the node, and the program uses datastax driver 2.1.8
with 5 threads to insert data into cassandra on the same machine; the data's
capacity is 6G and 1157000 lines.

why is the throughput 20000/s on the node ?
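A back-of-the-envelope check on those figures (shell integer arithmetic; taking 6G as 6 GiB, an assumption):

```shell
# 6G of data spread over 1157000 rows, inserted at 20000 rows/s.
bytes_per_row=$((6 * 1024 * 1024 * 1024 / 1157000))   # ~5568 bytes per row
mb_per_sec=$((20000 * bytes_per_row / 1024 / 1024))   # ~106 MB/s through the client
echo "$bytes_per_row $mb_per_sec"
```

With rows this large, 20000/s already means on the order of 100 MB/s flowing through a single five-thread client on the same machine, which can saturate the client long before the node itself.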

# Per-thread stack size.
JVM_OPTS="$JVM_OPTS -Xss512k"
 
# Larger interned string table, for gossip's benefit (CASSANDRA-6410)
JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
 
# GC tuning options
JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalMode"
JVM_OPTS="$JVM_OPTS -XX:+DisableExplicitGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSConcurrentMTEnabled"
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"  
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=60000"

memtable_heap_space_in_mb: 1024
memtable_offheap_space_in_mb: 10240
memtable_cleanup_threshold: 0.55 
memtable_allocation_type: heap_buffers 
 


That's all,
thanks




郝加来

Finance East China Business Unit


Neusoft Corporation
Neusoft Software Park, 1000 Ziyue Road, Minhang District, Shanghai
Postcode:200241
Tel:(86 21) 33578591
Fax:(86 21) 23025565-111
Mobile:13764970711
Email:ha...@neusoft.com
Http://www.neusoft.com


---
Confidentiality Notice: The information contained in this e-mail and any 
accompanying attachment(s)
is intended only for the use of the intended recipient and may be confidential 
and/or privileged of
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of 
this communication is
not the intended recipient, unauthorized use, forwarding, printing,  storing, 
disclosure or copying
is strictly prohibited, and may be unlawful.If you have received this 
communication in error,please
immediately notify the sender by return e-mail, and delete the original message 
and all copies from
your system. Thank you.
---