[
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134635#comment-14134635
]
graham sanderson edited comment on CASSANDRA-7546 at 9/15/14 10:50 PM:
-----------------------------------------------------------------------
Finally getting back to this; I've been doing other things (this is slightly
lower priority since we already have it in production), as well as repeatedly
breaking myself physically, requiring orthopedic visits! I just realized that
the c6a2c65a75ade revision voted on for 2.1.0, which I deployed, is not the
same as the released 2.1.0. I am now upgrading, since cassandra-stress changes
snuck in.
Note that I plan to stress using 1024, 256, 16, and 1 partitions, first with
all 5 nodes up and then with 4 nodes up and one down to test the effect of
hinting (note: replication factor of 3 and cl=LOCAL_QUORUM), as well as with at
least memtable_allocation_type = heap_buffers and offheap_buffers, as sketched
below.
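For reference, the allocator is flipped in cassandra.yaml between runs; a
minimal sketch (value names as I understand the 2.1 cassandra.yaml, not copied
from this ticket):
{code}
# cassandra.yaml - memtable allocation, one stress run per setting
memtable_allocation_type: heap_buffers
# memtable_allocation_type: offheap_buffers
{code}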
I want to do one cell insert per batch... I'm upgrading in part because of the
new visit/revisit stuff. I'm not 100% sure how to use them correctly; I'll keep
playing, but you may answer before I have finished upgrading and tried with
this. My first attempt, on the original 2.1.0 revision, ended up with only one
clustering key value per partition, which is not what I wanted (because it'll
make the trees small).
Sample YAML for 1024 partitions
{code}
#
# This is an example YAML profile for cassandra-stress
#
# insert data
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1)
#
# read, using query simple1:
# cassandra-stress profile=/home/jake/stress1.yaml ops(simple1=1)
#
# mixed workload (90/10)
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1,simple1=9)
#
# Keyspace info
#
keyspace: stresscql
#
# The CQL for creating a keyspace (optional if it already exists)
#
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
#
# Table info
#
table: testtable
#
# The CQL for creating a table you wish to stress (optional if it already exists)
#
table_definition: |
  CREATE TABLE testtable (
        p text,
        c text,
        v blob,
        PRIMARY KEY(p, c)
  ) WITH COMPACT STORAGE
    AND compaction = { 'class':'LeveledCompactionStrategy' }
    AND comment='TestTable'
#
# Optional meta information on the generated columns in the above table
# The min and max only apply to text and blob types
# The distribution field represents the total unique population
# distribution of that column across rows. Supported types are
#
#      EXP(min..max)                        An exponential distribution over the range [min..max]
#      EXTREME(min..max,shape)              An extreme value (Weibull) distribution over the range [min..max]
#      GAUSSIAN(min..max,stdvrng)           A gaussian/normal distribution, where mean=(min+max)/2, and stdev is (mean-min)/stdvrng
#      GAUSSIAN(min..max,mean,stdev)        A gaussian/normal distribution, with explicitly defined mean and stdev
#      UNIFORM(min..max)                    A uniform distribution over the range [min, max]
#      FIXED(val)                           A fixed distribution, always returning the same value
#
# Aliases: extr, gauss, normal, norm, weibull
#
# If preceded by ~, the distribution is inverted
#
# Defaults for all columns are size: uniform(4..8), population: uniform(1..100B), cluster: fixed(1)
#
columnspec:
  - name: p
    size: fixed(16)
    population: uniform(1..1024)     # the range of unique values to select for the field (default is 100Billion)
  - name: c
    size: fixed(26)
    #cluster: uniform(1..100B)
  - name: v
    size: gaussian(50..250)

insert:
  partitions: fixed(1)      # number of unique partitions to update in a single operation
                            # if batchcount > 1, multiple batches will be used but all partitions will
                            # occur in all batches (unless they finish early); only the row counts will vary
  batchtype: LOGGED         # type of batch to use
  visits: fixed(10M)        # not sure about this

queries:
  simple1: select * from testtable where p = ? and c = ? LIMIT 10
{code}
Command-line
{code}
./cassandra-stress user profile=~/cqlstress-1024.yaml ops\(insert=1\) cl=LOCAL_QUORUM -node $NODES -mode native prepared cql3 | tee results/results-2.1.0-p1024-a.txt
{code}
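Assuming I name the other profiles analogously (cqlstress-256.yaml,
cqlstress-16.yaml, and cqlstress-1.yaml are my naming and don't exist yet; each
would just change the p population to uniform(1..N)), the full sweep would look
something like:
{code}
for p in 1024 256 16 1; do
  ./cassandra-stress user profile=~/cqlstress-$p.yaml ops\(insert=1\) cl=LOCAL_QUORUM \
      -node $NODES -mode native prepared cql3 | tee results/results-2.1.0-p$p-a.txt
done
{code}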
> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-7546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: graham sanderson
> Assignee: graham sanderson
> Fix For: 2.1.1
>
> Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt,
> 7546.20_4.txt, 7546.20_5.txt, 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt,
> 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt, hint_spikes.png,
> suggestion1.txt, suggestion1_21.txt, young_gen_gc.png
>
>
> In order to preserve atomicity, this code attempts to read, clone/update,
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some
> fairly staggering memory growth (the more cores on your machine, the worse it
> gets).
> Whilst many usage patterns don't do highly concurrent updates to the same
> partition, hinting today does, and in this case wild (order(s) of magnitude
> more than expected) memory allocation rates can be seen (especially when the
> updates being hinted are small updates to different partitions, which can
> happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation
> whilst not slowing down the very common un-contended case.
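For illustration only (simplified stand-in types and names, not the actual
AtomicSortedColumns internals), the allocating spin loop described above has
this shape: each failed CAS discards the copy built for that attempt, so the
allocation rate scales with the number of threads racing on one partition.
{code}
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a read -> clone/update -> CAS retry loop. Under contention,
// every losing thread's freshly built copy becomes instant garbage.
public class CasSpinSketch {
    private final AtomicReference<NavigableMap<String, byte[]>> holder =
            new AtomicReference<NavigableMap<String, byte[]>>(new TreeMap<String, byte[]>());

    public void addCell(String name, byte[] value) {
        while (true) {
            NavigableMap<String, byte[]> current = holder.get();
            // Allocate a full copy on every attempt...
            NavigableMap<String, byte[]> updated = new TreeMap<String, byte[]>(current);
            updated.put(name, value);
            // ...which is thrown away whenever the CAS loses the race.
            if (holder.compareAndSet(current, updated))
                return;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final CasSpinSketch sketch = new CasSpinSketch();
        Runnable writer = new Runnable() {
            public void run() {
                for (int i = 0; i < 100000; i++)
                    sketch.addCell(Thread.currentThread().getName() + ":" + i, new byte[8]);
            }
        };
        Thread a = new Thread(writer, "t1");
        Thread b = new Thread(writer, "t2");
        a.start(); b.start();
        a.join(); b.join();
    }
}
{code}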