[jira] [Created] (CASSANDRA-12355) Install Errored: 'apt-get update' failed exit status: 100
VIJAY KUMAR VELPULA created CASSANDRA-12355:
--------------------------------------------

             Summary: Install Errored: 'apt-get update' failed exit status: 100
                 Key: CASSANDRA-12355
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12355
             Project: Cassandra
          Issue Type: Bug
          Components: Configuration
         Environment: DSE v4.8, OpsCenter v5.2.4, Ubuntu 14.04
            Reporter: VIJAY KUMAR VELPULA
         Attachments: apt-get update.txt

While building a new cluster from OpsCenter 5.2.4 we are hitting the error "Install Errored: 'apt-get update' failed exit status: 100". Attaching the stdout and stderr logs.

Thanks

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732184#comment-14732184 ]

vijay commented on CASSANDRA-10253:
-----------------------------------

Marcus, the Issue 2 I reported is: in my normal data-ingest process I see at most 10 or 20 sstables; however, once we start running nodetool incremental repair the sstable count grows to a few thousand. I have seen the sstable count grow from 20 to 100K small sstables, and compaction never stops compacting them. I see this issue only when using DTCS together with the repair process. Let me know your thoughts.

> Incremental repairs not working as expected with DTCS
> -----------------------------------------------------
>
>                 Key: CASSANDRA-10253
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10253
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Pre-prod
>            Reporter: vijay
>            Assignee: Marcus Eriksson
>              Labels: dtcs
>             Fix For: 2.1.x
>
>         Attachments: sstablemetadata-cluster-logs.zip, systemfiles 2.zip
>
> Hi,
> We are ingesting 6 million records every 15 minutes into one DTCS table and relying on Cassandra for purging the data. The table schema is given below.
> Issue 1: we expected that an sstable created on day d1 would not be compacted after d1; we are not seeing this, though I do see some data being purged at random intervals.
> Issue 2: when we run incremental repair using "nodetool repair keyspace table -inc -pr", each sstable splits into multiple smaller sstables, increasing the total storage. The behavior is the same when running repairs on any node, any number of times.
> There are mutation drops in the cluster.
> Table:
> {code}
> CREATE TABLE TableA (
>     F1 text,
>     F2 int,
>     createts bigint,
>     stats blob,
>     PRIMARY KEY ((F1, F2), createts)
> ) WITH CLUSTERING ORDER BY (createts DESC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'min_threshold': '12', 'max_sstable_age_days': '1',
>         'base_time_seconds': '50',
>         'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.0
>     AND default_time_to_live = 93600
>     AND gc_grace_seconds = 3600
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
> {code}
> Thanks

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
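For context on Issue 1: with max_sstable_age_days = 1 (as in the schema above), DTCS stops considering an sstable for compaction once its newest data is more than a day old. A minimal illustrative sketch of that cutoff (the function and parameter names are made up here; this is not Cassandra's actual code):

```python
def eligible_for_dtcs_compaction(sstable_max_timestamp_ms, now_ms,
                                 max_sstable_age_days=1.0):
    """True while the sstable's newest cell is younger than the cutoff;
    past it, DTCS should leave the sstable alone entirely."""
    max_age_ms = int(max_sstable_age_days * 24 * 60 * 60 * 1000)
    return (now_ms - sstable_max_timestamp_ms) < max_age_ms
```

Under this model, a day-d1 sstable should drop out of compaction on day d2, which is exactly the behavior the reporter expected but did not observe.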
[jira] [Commented] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732213#comment-14732213 ]

vijay commented on CASSANDRA-10253:
-----------------------------------

Also, compactions never stop even with no data ingestion. Please find below the results with no data ingestion and no CRUD operations over a period of 3 days:

nodetool compactionstats
pending tasks: 4617
   compaction type   keyspace       table    completed        total   unit   progress
        Compaction       upc1   alarmnote    419486529    421927219   bytes     99.42%
        Compaction       upc1   alarmnote    262150486    657730463   bytes     39.86%
        Compaction       upc1   alarmnote     52429308    329877089   bytes     15.89%
        Compaction       upc1   alarmnote   1149647113   3655964819   bytes     31.45%
Active compaction remaining time: 0h00m47s

nodetool compactionstats
pending tasks: 14068
   compaction type   keyspace       table    completed        total   unit   progress
        Compaction       upc1   alarmnote    104863849    516187168   bytes     20.32%
        Compaction       upc1   alarmnote    576771960   3541850604   bytes     16.28%
        Compaction       upc1   alarmnote    209717447    542218900   bytes     38.68%
Active compaction remaining time: 0h00m55s

I have seen pending tasks go to 300,000 at times.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
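The completed/total byte columns above determine the progress column directly; a quick sanity check of the reconstructed figures (plain Python, not Cassandra code):

```python
def progress_pct(completed_bytes, total_bytes):
    """Progress as nodetool compactionstats reports it:
    completed / total, as a percentage rounded to two decimals."""
    return round(100.0 * completed_bytes / total_bytes, 2)
```

For example, the first row's 419486529 of 421927219 bytes yields 99.42%, matching the output above.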
[jira] [Commented] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729171#comment-14729171 ]

vijay commented on CASSANDRA-10253:
-----------------------------------

Marcus, we are using vnodes. I reported two issues in this JIRA; even without running incremental backups we still have Issue 1 (please see the Issue 1 description above).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10055) High CPU load for Cassandra 2.1.8
[ https://issues.apache.org/jira/browse/CASSANDRA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727979#comment-14727979 ]

vijay commented on CASSANDRA-10055:
-----------------------------------

Hi Benedict, controlling the number of coreConnPerHostLocal and maxConnPerHostLocal helped with the CPU load issue we were facing.

> High CPU load for Cassandra 2.1.8
> ---------------------------------
>
>                 Key: CASSANDRA-10055
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10055
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Config
>         Environment: Prod
>            Reporter: vijay
>              Labels: triaged
>             Fix For: 2.1.x
>
>         Attachments: dstst-lcdn.log, dstst-lcdn2.log, dstst-lcdn3.log, dstst-lcdn4.log, dstst-lcdn5.log, dstst-lcdn6.log, js.log, js2.log, js3.log, js4.log, js5.log, js6.log, top-bHn1-2.log, top-bHn1-3.log, top-bHn1-4.log, top-bHn1-5.log, top-bHn1-6.log, top-bHn1.log
>
> We are seeing high CPU load, about 80% to 100%, in Cassandra 2.1.8 when doing data ingest; we did not have this issue with the 2.0.x versions of Cassandra. We tested this on different cloud platforms and the results are the same.
> CPU: tested with AWS m3.2xlarge instances.
> Ingest rate: injecting 1 million inserts; each insert is 1000 bytes.
> No other operations are happening in Cassandra except the inserts.
> Let me know if more info is needed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728076#comment-14728076 ]

vijay commented on CASSANDRA-10253:
-----------------------------------

Hi, we are running Cassandra 2.1.8.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vijay updated CASSANDRA-10253:
------------------------------
    Attachment:     (was: systemfiles 2.zip)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vijay updated CASSANDRA-10253:
------------------------------
    Attachment: systemfiles 2.zip

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
vijay created CASSANDRA-10253:
---------------------------------

             Summary: Incremental repairs not working as expected with DTCS
                 Key: CASSANDRA-10253
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10253
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Pre-prod
            Reporter: vijay

Hi,
We are ingesting 6 million records every 15 minutes into one DTCS table and relying on Cassandra for purging the data. The table schema is given below.
Issue 1: we expected that an sstable created on day d1 would not be compacted after d1; we are not seeing this, though I do see some data being purged at random intervals.
Issue 2: when we run incremental repair using "nodetool repair keyspace table -inc -pr", each sstable splits into multiple smaller sstables, increasing the total storage. The behavior is the same when running repairs on any node, any number of times.
There are mutation drops in the cluster.
Table:
CREATE TABLE TableA (
    F1 text,
    F2 int,
    createts bigint,
    stats blob,
    PRIMARY KEY ((F1, F2), createts)
) WITH CLUSTERING ORDER BY (createts DESC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '12', 'max_sstable_age_days': '1',
        'base_time_seconds': '50',
        'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 93600
    AND gc_grace_seconds = 3600
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

Thanks

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10253) Incremental repairs not working as expected with DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vijay updated CASSANDRA-10253:
------------------------------
    Attachment: systemfiles 2.zip

Attached system.log files from all 6 nodes.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10055) High CPU load for Cassandra 2.1.8
[ https://issues.apache.org/jira/browse/CASSANDRA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700173#comment-14700173 ]

vijay commented on CASSANDRA-10055:
-----------------------------------

Benedict, the jstack and top outputs were taken relatively close to each other. I will try to get more statistics on this and get back. Thanks

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10055) High CPU load for Cassandra 2.1.8
[ https://issues.apache.org/jira/browse/CASSANDRA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vijay updated CASSANDRA-10055:
------------------------------
    Attachment: dstst-lcdn5.log
                dstst-lcdn4.log
                dstst-lcdn3.log
                dstst-lcdn2.log
                dstst-lcdn6.log
                dstst-lcdn.log
                js2.log
                js3.log
                js4.log
                js5.log
                js6.log
                top-bHn1-6.log
                top-bHn1-5.log
                top-bHn1-4.log
                top-bHn1-3.log
                top-bHn1-2.log
                js.log
                top-bHn1.log

Attached: jstack, top, and dstat -lcdn output for the 6-node Cassandra 2.1.8 cluster.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10055) High CPU load for Cassandra 2.1.8
[ https://issues.apache.org/jira/browse/CASSANDRA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695465#comment-14695465 ]

vijay commented on CASSANDRA-10055:
-----------------------------------

Hi Benedict,
I attached all the files you requested. I am using the 2.1.7 drivers on 4 application servers. I tested the cluster with default settings (ConnPerHost local/remote min=max=1, maxReqPerConnLocal=1024) and then up to the settings given below; either way I see a burst in average CPU load.

Connection settings:
cassandra.maxReqPerConnLocal=2048
cassandra.maxReqPerConnRemote=1000
cassandra.newConnThresholdLocal=400
cassandra.NewConnThresholdRemote=200
cassandra.coreConnPerHostLocal=20
cassandra.maxConnPerHostLocal=20
cassandra.maxConnPerHostRemote=10
cassandra.coreConnPerHostRemote=10

Please let me know your thoughts. Thanks

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
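The key=value pairs in the comment above look like Java-driver tuning knobs supplied as system properties. A small illustrative helper that renders them as -D JVM arguments (the property names come straight from the comment; the -D formatting is an assumption about how they are passed, not something the source confirms):

```python
# Settings exactly as listed in the comment above.
settings = {
    "cassandra.maxReqPerConnLocal": 2048,
    "cassandra.maxReqPerConnRemote": 1000,
    "cassandra.newConnThresholdLocal": 400,
    "cassandra.NewConnThresholdRemote": 200,
    "cassandra.coreConnPerHostLocal": 20,
    "cassandra.maxConnPerHostLocal": 20,
    "cassandra.maxConnPerHostRemote": 10,
    "cassandra.coreConnPerHostRemote": 10,
}

def as_jvm_flags(props):
    # Render each property as a -Dkey=value JVM argument.
    return [f"-D{k}={v}" for k, v in props.items()]
```

Per the later comment on this ticket, it was the coreConnPerHostLocal / maxConnPerHostLocal pair that mattered for the CPU load.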
[jira] [Created] (CASSANDRA-10055) High CPU load for Cassandra 2.1.8
vijay created CASSANDRA-10055:
---------------------------------

             Summary: High CPU load for Cassandra 2.1.8
                 Key: CASSANDRA-10055
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10055
             Project: Cassandra
          Issue Type: Bug
          Components: Config
         Environment: Prod
            Reporter: vijay

We are seeing high CPU load, about 80% to 100%, in Cassandra 2.1.8 when doing data ingest; we did not have this issue with the 2.0.x versions of Cassandra. We tested this on different cloud platforms and the results are the same.
CPU: tested with AWS m3.2xlarge instances.
Ingest rate: injecting 1 million inserts; each insert is 1000 bytes.
No other operations are happening in Cassandra except the inserts.
Let me know if more info is needed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-3930) CommitLogSegment uses RandomAccessFile which doesn't have fadvise
[ https://issues.apache.org/jira/browse/CASSANDRA-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay reassigned CASSANDRA-3930:
--------------------------------
    Assignee:     (was: Vijay)

CommitLogSegment uses RandomAccessFile which doesn't have fadvise
-----------------------------------------------------------------

                 Key: CASSANDRA-3930
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3930
             Project: Cassandra
          Issue Type: Improvement
    Affects Versions: 1.0.7
            Reporter: Vijay
            Priority: Minor

Wondering if we even need an mmap'd file in this case, as access is always sequential. If we do need it, we might want to replace logFileAccessor = new RandomAccessFile(logFile, "rw"); with logFileAccessor = SequentialWriter.open(logFile, true); in CommitLogSegment.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
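The ticket above is about the fadvise hint that RandomAccessFile cannot issue. For illustration only, the same kind of hint is available from Python via os.posix_fadvise (Linux/POSIX; the function name and path handling here are just a sketch, not Cassandra's code):

```python
import os

def drop_page_cache(path):
    """Advise the kernel to drop cached pages for a fully written file,
    the kind of hint a sequentially written commitlog segment can use
    once its contents will not be re-read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):  # not available on all platforms
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The advice is purely a hint: the file's contents are unchanged, only the page cache is (potentially) released.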
[jira] [Assigned] (CASSANDRA-6544) Reduce GC activity during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay reassigned CASSANDRA-6544:
--------------------------------
    Assignee:     (was: Vijay)

Reduce GC activity during compaction
------------------------------------

                 Key: CASSANDRA-6544
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6544
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Vijay
              Labels: compaction
             Fix For: 2.1.3

We are noticing an increase in P99 latency while compactions are running at full stream. Most of it is because of increased GC activity (followed by full GC). The obvious workaround is to throttle the compactions, but with SSDs we can get more disk bandwidth for reads and compactions. It would be nice to move the compaction object allocations off heap. The first thing to do might be to create an off-heap slab allocator sized to the compaction in-memory size and recycle it. We might also want to make it configurable so folks can disable it when they don't have off-heap memory to reserve.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
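The slab-allocator-and-recycle idea above can be sketched in miniature: one pre-reserved buffer, bump-pointer allocation, and a wholesale reset instead of per-object frees, so allocations during a compaction never reach the garbage collector. This is an illustrative toy (Cassandra's real version would be off-heap Java), with made-up names:

```python
class SlabAllocator:
    """Bump-pointer arena: allocations are a pointer increment, and
    reset() recycles the whole slab at once for the next compaction."""
    def __init__(self, capacity):
        self.buf = bytearray(capacity)  # stands in for the reserved slab
        self.offset = 0

    def allocate(self, size):
        if self.offset + size > len(self.buf):
            return None  # slab exhausted; caller must recycle or fall back
        view = memoryview(self.buf)[self.offset:self.offset + size]
        self.offset += size
        return view

    def reset(self):
        self.offset = 0  # recycle: all prior allocations are invalidated
```

The configurability the ticket asks for would amount to sizing the slab (or disabling it) based on available off-heap memory.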
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231327#comment-14231327 ]

Vijay commented on CASSANDRA-7438:
----------------------------------

[~snazy] I was trying to compare OHC and found a few major bugs.
1) You have individual method synchronization on the Map, which doesn't ensure that your get is locked before a put is performed (same with clean, hot(N), remove, etc.); look at the SynchronizedMap source code to do it right, or it will crash soon.
2) Even after I fix that, I think there is a correctness issue in the hashing algorithm: get returns a lot of errors, and it looks like there are some memory leaks too.

Serializing Row cache alternative (Fully off heap)
--------------------------------------------------

                 Key: CASSANDRA-7438
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
         Environment: Linux
            Reporter: Vijay
            Assignee: Vijay
              Labels: performance
             Fix For: 3.0
         Attachments: 0001-CASSANDRA-7438.patch, tests.zip

Currently SerializingCache is partially off heap; keys are still stored in the JVM heap as ByteBuffers.
* There are higher GC costs for a reasonably big cache.
* Some users have used the row cache efficiently in production for better results, but this requires careful tuning.
* Memory overhead for the cache entries is relatively high.
So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with the cache. We want to ensure that the new implementation matches the existing APIs (ICache), and the implementation needs to have safe memory access, low memory overhead, and as few memcpys as possible. We might also want to make this cache configurable.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
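The first bug reported above is the classic check-then-act problem: synchronizing each map method individually does not make a compound operation (look up, then insert) atomic. The standard fix is to hold one lock across the whole compound operation; a generic sketch of that pattern (not OHC's actual code, class and method names invented here):

```python
import threading

class AtomicCache:
    """get/put are each safe under self._lock, but compound operations
    must also run under the same single lock to be atomic."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get_or_put(self, key, value):
        # Holding the lock across the check AND the act: no other thread
        # can slip a put() in between the lookup and the insert.
        with self._lock:
            if key not in self._data:
                self._data[key] = value
            return self._data[key]
```

With per-method locking only, two threads calling an unlocked get-then-put sequence could both observe a miss and both insert, which is exactly the interleaving the comment warns about.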
[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231327#comment-14231327 ]

Vijay edited comment on CASSANDRA-7438 at 12/2/14 11:33 AM:
------------------------------------------------------------

[~snazy] I was trying to compare OHC and found a few major bugs. I think there is a correctness issue in the hashing algorithm: get returns a lot of errors, and it looks like there are some memory leaks too.

was (Author: vijay2...@yahoo.com):
[~snazy] I was trying to compare OHC and found a few major bugs. 1) You have individual method synchronization on the Map, which doesn't ensure that your get is locked before a put is performed (same with clean, hot(N), remove, etc.); look at the SynchronizedMap source code to do it right, or it will crash soon. 2) Even after I fix that, I think there is a correctness issue in the hashing algorithm: get returns a lot of errors, and it looks like there are some memory leaks too.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231878#comment-14231878 ] Vijay commented on CASSANDRA-7438: --

Never mind, my bad: it was related to the below (which needs to be more configurable instead). The items were going missing earlier than I thought they should, and it looks like you just evict the items per segment (if a segment is used more, more items will disappear from that segment, and the least-used segment's items will remain).
{code}
// 12.5% if capacity less than 8GB
// 10% if capacity less than 16 GB
// 5% if capacity is higher than 16GB
{code}
Also noticed you don't have replace(), which Cassandra uses. Anyway, I am going to stop working on this for now; let me know if someone wants any other info.
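The capacity-based eviction fractions quoted in the comment above can be sketched as follows. This is an illustrative reconstruction, not code from the patch; `evictionFraction` is a hypothetical helper name.

```java
// Sketch of the capacity-tiered eviction fractions quoted above.
// Hypothetical helper, not code from the patch under review.
public class EvictionPolicy {
    static final long GB = 1L << 30;

    static double evictionFraction(long capacityBytes) {
        if (capacityBytes < 8 * GB)  return 0.125; // 12.5% below 8 GB
        if (capacityBytes < 16 * GB) return 0.10;  // 10% below 16 GB
        return 0.05;                               // 5% at 16 GB and above
    }
}
```

The complaint in the comment is that these tiers are hard-coded; making the fraction a constructor parameter would address it.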
[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231878#comment-14231878 ] Vijay edited comment on CASSANDRA-7438 at 12/2/14 8:31 PM: ---

EDIT: Here is the explanation. Run the benchmark with the following options (lruc benchmark):
{code}java -Djava.library.path=/usr/local/lib/ -jar ~/lrucTest.jar -t 30 -s 6147483648 -c ohc{code}
And you will see something like this (errors == not found in the cache, even though all the items you need are in the cache):
{code}
Memory consumed: 3 GB / 5 GB or 427170 / 6147483648, size 4980, queued (LRU q size) 0
VM total: 2 GB VM free: 2 GB
Get Operation (micros)
time_taken, count, mean, median, 99thPercentile, 999thPercentile, error
4734724, 166, 2.42, 1.93, 8.58, 24.74, 166
4804375, 166, 2.40, 1.92, 4.56, 106.23, 166
4805858, 166, 2.45, 1.95, 3.94, 11.76, 166
4842886, 166, 2.40, 1.92, 7.46, 26.73, 166
{code}
You really need test cases :) Anyway, I am going to stop working on this ticket now; let me know if someone wants any other info.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228871#comment-14228871 ] Vijay commented on CASSANDRA-7438: --

Should be taken care of too: it should become a duplicate delete to the queue and should work normally (via itemUnlinkQueue). Here is the adjusted test case for it: https://github.com/Vijay2win/lruc/blob/master/src/test/java/com/lruc/unsafe/UnsafeQueueTest.java#L81
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228523#comment-14228523 ] Vijay commented on CASSANDRA-7438: --

{quote}I would break out the performance comparison with and without warming up the cache so we know how it performs when you aren't measuring the resize pauses.{quote}
Yep, and in steady state it is similar to get; I have verified that the latency is due to rehash. Better benchmarks on big machines will be done on Monday.

Unfortunately -1 on partitions: it will be a lot more complex and will be hard for users to understand. If we have to expand the partitions, we have to figure out a better consistent hashing algo. Cassandra within Cassandra, maybe. Moreover, we will end up keeping the current code as-is to move the maps and queues off heap.

Sorry, I don't understand the argument about code complexity. If we are talking about code complexity, the unsafe code is 1000 lines including the license headers :)

The current contention topic is whether to use CAS for locks, which is showing higher CPU cost, and I agree with Pavel on the latencies, as shown in the numbers.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228524#comment-14228524 ] Vijay commented on CASSANDRA-7438: --

PS: all the latency spikes are in the 100s of micros. It's a day-and-night comparison to the current cache :)
[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228523#comment-14228523 ] Vijay edited comment on CASSANDRA-7438 at 11/28/14 9:47 PM:

{quote}I would break out the performance comparison with and without warming up the cache so we know how it performs when you aren't measuring the resize pauses.{quote}
Yep, and in steady state it is similar to get; I have verified that the latency is due to rehash. Better benchmarks on big machines will be done on Monday.

Unfortunately -1 on partitions: it will be a lot more complex and will be hard for users to understand. If we have to expand the partitions, we have to figure out a better consistent hashing algo. Cassandra within Cassandra, maybe. Moreover, we will end up keeping the current code as-is to move the maps and queues off heap.

Sorry, I don't understand the argument about code complexity. If we are talking about code complexity, the unsafe code is 1000 lines including the license headers :)

The current contention topic is whether to use CAS for locks, which is showing higher CPU cost, and I agree with Pavel: the locks also show up in the numbers.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228569#comment-14228569 ] Vijay commented on CASSANDRA-7438: --

{quote}The queue is maintained by a separate thread that requires signalling{quote}
The thread is only signalled if it is not performing an operation. I am lost.

{quote}resulting in a memory leak{quote}
I am 100% sure that this is not true. Can you write a test case to make this happen, please?

{quote}but prevent all reader or writer threads from making progress by taking the locks for all buckets immediately{quote}
I am sure this cannot be done; if you don't lock the write, you lose coherence and consistency.

{quote}During a grow, we can lose puts because we unlock the old segments{quote}
A test case again, please. I don't think this can happen either; I spent a lot of time testing that exact scenario.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228577#comment-14228577 ] Vijay commented on CASSANDRA-7438: --

Maybe you know better than me, but a map.remove cannot be followed by a successful map.get, because the remove is within a lock on the segment...
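The argument above rests on per-segment locking: get and remove take the same segment lock, so a remove cannot interleave with a successful get for the removed key. A minimal illustrative sketch (not the patch's code; `Segment` here is a hypothetical stand-in):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: all operations on a segment take the same lock, so once
// remove(k) returns, no concurrent get(k) can still observe the value.
class Segment<K, V> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<K, V> map = new HashMap<>();

    V get(K key) {
        lock.lock();
        try { return map.get(key); } finally { lock.unlock(); }
    }

    V remove(K key) {
        lock.lock();
        try { return map.remove(key); } finally { lock.unlock(); }
    }

    void put(K key, V value) {
        lock.lock();
        try { map.put(key, value); } finally { lock.unlock(); }
    }
}
```

The contrast is with per-method synchronization on the map alone, which serializes individual calls but not multi-step sequences such as check-then-act.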
[jira] [Updated] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-7438: - Attachment: tests.zip

Hi Jonathan, we should at least make the cache more pluggable with this ticket.

{quote}Well, it can, but it's almost always a bad idea. Not something we should optimize for.{quote}
Just my 2 cents: I have seen some use cases which store bigger blobs as column values too... Sorry to diverge the discussion again.

Alright, completed most of the items in this ticket:
* Expiry thread to proactively remove items (it looks like it catches up with the load pretty well).
* Some minor comments on metrics and a bit of refactoring (will have to revisit all the discussion again).
* Some error handling and additional tests.

Looks like the JNI-based lruc and the Unsafe version are close to each other. Numbers attached...
* The difference is that Unsafe performs worse at the P99 distribution, and the C implementation at P999.
* I think the performance hit on both caches is during the rehash global locks.
* Note these numbers are from my laptop, and during the runs the CPUs are at 100%, so they might not be perfect.
* Will have to run the benchmark on server-grade (big memory) machines on Monday.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226552#comment-14226552 ] Vijay commented on CASSANDRA-7438: --

{quote}One thing I noticed during benchmarking is that (concurrent?){quote}
Yes, use these options; feel free to make them more configurable if you need.
{code}
public static final String TYPE = "c";
public static final String THREADS = "t";
public static final String SIZE = "s";
public static final String ITERATIONS = "i";
public static final String PREFIX_SIZE = "p";
{code}
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224904#comment-14224904 ] Vijay commented on CASSANDRA-7438: --

{quote}sun.misc.Hashing doesn't seem to exist for me, maybe a Java 8 issue? StatsHolder, same AtomicLongArray suggestion. Also consider LongAdder.{quote}
Yep; let me find alternatives for Java 8 (and, until 8, for LongAdder).

{quote}The queue really needs to be bounded, producer and consumer could proceed at different rates. In Segment.java in the replace path AtomicLong.addAndGet is called back to back, could be called once with the math already done. I believe each of those stalls processing until the store buffers have flushed. The put path does something similar and could have the same optimization.{quote}
Yeah, those were an oversight.

{quote}Tasks submitted to executor services via submit will wrap the result including exceptions in a future which silently discards them. The library might take at initialization time a listener for these errors, or if it is going to be C* specific it could use the wrapped runnable or similar.{quote}
Are you suggesting configurable logging/exception handling in case the two threads throw exceptions? If yes, sure. Other exceptions AFAIK are already propagated. (Still needs cleanup, though.)

{quote}A lot of locking that was spin locking (which unbounded I don't think is great) is now blocking locking. There is no adaptive spinning if you don't use synchronized. If you are already using unsafe maybe you could do monitor enter/exit. Never tried it. Having the table (segments) on heap is pretty undesirable to me. Happy to be proved wrong, but I think a flyweight over off heap would be better.{quote}
Segments are small in memory so far in my tests. The spin lock is there to make sure the lock checks whether the segment was rehashed or not; this is better than having a separate lock, which would be central (no different from Java or memcached). Not sure I understand the Unsafe lock; any example would help. The segments are on heap mainly to handle the locking. I think we can do a bit of CAS, but the global lock on rehashing will be a problem (maybe an alternate approach is required).

{quote}It looks like concurrent calls to rehash could cause the table to rehash twice since the rebalance field is not CASed. You should do the volatile read, and then attempt the CAS (avoids putting the cache line in exclusive state every time).{quote}
Nope, it is a single-threaded executor, and the rehash boolean is already volatile :) The next commit will have conditions instead (similar to the C implementation).

{quote}If the expiration lock is already locked some other thread is doing the expiration work. You might keep a semaphore for puts that bypass the lock so other threads can move on during expiration. I suppose after the first few evictions new puts will move on anyways. This would show up in a profiler if it were happening.{quote}
Good point... Or a tryLock, to spin and check if some other thread has released enough memory.

{quote}hotN looks like it could lock for quite a while (hundreds of milliseconds, seconds) depending on the size of N. You don't need to use a linked list for the result just allocate an array list of size N. Maybe hotN should be able to yield, possibly leaving behind an iterator that evictors will have to repair. Maybe also depends on how top N handles duplicate or multiple versions of keys. Alternatively hotN could take a read lock, and writers could skip the cache?{quote}
We cannot have duplicates in the queue (remember, it is a doubly linked list of the items in the cache). A read lock on q_expiry_lock is all we need; let me fix it.
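Two of the review points above (fold back-to-back AtomicLong.addAndGet calls into one, and consider LongAdder for hot statistics counters) can be sketched as follows. Names are illustrative, not from the patch:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Sketch of the reviewer's suggestions: a single addAndGet with the math
// already done (one store-buffer flush instead of two), and LongAdder for
// contended counters that are written often but read rarely.
class Stats {
    private final AtomicLong bytesUsed = new AtomicLong();
    private final LongAdder hits = new LongAdder();

    // Replace path: adjust by the delta in one call, rather than
    // addAndGet(-oldSize) followed by addAndGet(newSize).
    long onReplace(long oldSize, long newSize) {
        return bytesUsed.addAndGet(newSize - oldSize);
    }

    void recordHit()  { hits.increment(); }
    long hitCount()   { return hits.sum(); }
}
```

LongAdder (Java 8+) stripes updates across cells, so concurrent increments do not all contend on one cache line the way a shared AtomicLong does.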
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225453#comment-14225453 ] Vijay commented on CASSANDRA-7438: --

{quote}Segments are hash buckets correct?{quote}
Yes, and the way memcached and lruc do the rehashing is based on this algorithm, hence yes... That was the argument earlier about the JNI-based solution. (Also another reason I was talking about a configurable hash-expansion capability in my previous comment.)
{code}
unsigned long current_size = cas_incr(stats.hash_items, 1);
if (current_size > (hashsize(hashpower) * 3) / 2) {
    assoc_start_expand();
}
{code}
If we don't like the constant overhead of the cache on heap, and if you are talking about CAS (which we already do for ref counting): as mentioned before, we need an alternative strategy for the global locks during rebalance if we go with a lock-less strategy.

{quote}The task submitted to the executor doesn't check whether another rehash is required it just does it.{quote}
Until you complete a rehash, you don't know whether you need to rehash again or not... Am I missing something?
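The memcached-style trigger in the C snippet above (start an expansion once the item count exceeds 1.5x the bucket count) translates roughly as follows in Java. This is a hedged sketch with hypothetical names, not code from lruc:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the memcached-style expansion trigger: CAS-increment the item
// count on insert and signal expansion once it passes 1.5x the bucket count.
class HashTable {
    private final AtomicLong items = new AtomicLong();
    private volatile int hashPower; // bucket count is 2^hashPower

    HashTable(int hashPower) { this.hashPower = hashPower; }

    private long bucketCount() { return 1L << hashPower; }

    // Returns true when the caller should start an expansion.
    boolean onInsert() {
        long currentSize = items.incrementAndGet();
        return currentSize > (bucketCount() * 3) / 2;
    }
}
```

As the comment notes, whether another rehash is needed can only be re-evaluated after the current one completes, since the bucket count changes.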
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225532#comment-14225532 ] Vijay commented on CASSANDRA-7438: --

{quote}so each segment would be a 4-byte lock{quote}
Are you talking about just setting 1 for lock and 0 for unlock? Hmmm, alright, that's doable... I am guessing you have already seen how ReentrantLock implements locking.

{quote}The check on line 38 races with the assignment on line 39.{quote}
I thought we discussed this already... Yeah, that was supposed to be taken care of by this comment: "The next commit will have conditions instead (similar to the C implementation)". Have not committed it yet :)
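The "4-byte lock per segment" idea discussed above (CAS a word from 0 to 1 to lock, write 0 to unlock) can be sketched with AtomicInteger. This is a minimal illustration, not the patch's code; it has none of the fairness or adaptive spinning that ReentrantLock and synchronized provide:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal 4-byte spinlock sketch: 0 = unlocked, 1 = locked.
// Illustrative only; real use would bound the spin or fall back to parking.
class SpinLock {
    private final AtomicInteger state = new AtomicInteger(0);

    void lock() {
        while (!state.compareAndSet(0, 1)) {
            Thread.onSpinWait(); // CPU spin hint (Java 9+)
        }
    }

    void unlock() {
        state.set(0); // volatile write releases the lock
    }
}
```

An off-heap variant of the same idea would CAS the word directly at a native address, which is what lets the segment table itself move off heap.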
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1495#comment-1495 ] Vijay commented on CASSANDRA-7438: --

Alright, the first version of the pure-Java LRUCache is pushed:
* Basically a port from the C version. (Most of the test cases pass, and they are the same for both versions.)
* As Ariel mentioned before, we can use the disruptor for the ring buffer, but this doesn't use it yet.
* Expiry in the queue thread is not implemented yet.
* The algorithm to start the rehash needs to be more configurable and based on the capacity; will be pushing that soon.
* The overhead in the JVM heap is just the segments array.

https://github.com/Vijay2win/lruc/tree/master/src/main/java/com/lruc/unsafe
[jira] [Created] (CASSANDRA-8362) Reduce memory usage of RefCountedMemory
Vijay created CASSANDRA-8362: Summary: Reduce memory usage of RefCountedMemory Key: CASSANDRA-8362 URL: https://issues.apache.org/jira/browse/CASSANDRA-8362 Project: Cassandra Issue Type: Bug Reporter: Vijay Assignee: Vijay Priority: Minor We can store the reference count as the first 4 bytes of the Unsafe memory and use CAS [1] for reference counting of the memory. This change will save the object overhead plus an additional 4 bytes in the Java heap. Calling methods can hold the reference as a long. [1] http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/atomic/AtomicInteger.java#AtomicInteger.incrementAndGet%28%29
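The proposal — keep the reference count in the first 4 bytes of the allocation and bump it with the CAS retry loop that AtomicInteger.incrementAndGet uses — can be sketched as follows. This is an illustrative sketch, not the ticket's patch: AtomicInteger stands in for the native 4 bytes, whereas a real implementation would CAS directly at the off-heap address via Unsafe and hand callers a raw long address:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Reference-counting sketch in the AtomicInteger.incrementAndGet style. */
final class RefCountedBlock {
    private final AtomicInteger refcount = new AtomicInteger(1); // creator holds one ref

    /** CAS retry loop: re-read, check not freed, attempt the increment. */
    int retain() {
        int current;
        do {
            current = refcount.get();
            if (current == 0)
                throw new IllegalStateException("block already freed");
        } while (!refcount.compareAndSet(current, current + 1));
        return current + 1;
    }

    /** Returns true when the last reference is dropped and the memory may be freed. */
    boolean release() {
        return refcount.decrementAndGet() == 0;
    }
}
```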
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14198523#comment-14198523 ] Vijay commented on CASSANDRA-7438: -- Alright, it looks like the objection is not to the design but to the language choice; if I had known that, it would have been an easier choice in the first place (the argument earlier was that we didn't have a way to lock and use the queue easily — for example, the map vs. queue discussion, etc.). The thing we are losing is 4 months of dev, testing, and reviewers' time :). It's alright, let me give it a shot; after all, we will have an alternative to benchmark against.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14196492#comment-14196492 ] Vijay commented on CASSANDRA-7438: -- {quote} well I think you run into another issue which is that the ring buffer doesn't appear to check for queue full? {quote} Yeah, I thought about it; we need to handle those cases, and that's why it wasn't there in the first place. It should not be really bad, though. {quote} I don't agree that Unsafe couldn't do the exact same thing with no on heap references {quote} Probably; since we have figured out most of the implementation detail, sure we can, but there are always many different ways to solve the problem (it may not be very efficient to copy multiple bytes to get to the next item in the map, etc.; GC and CPU overhead would be higher IMHO). For example, Memcached used the expiration time set by clients to remove items, which made the slab allocator easier for them, but that is something we removed in lruc in favor of just a queue. {quote} I also wonder if splitting the cache into several instances each with a coarse lock per instance wouldn't result in simpler {quote} The problem there is how you would invalidate the least-used items: since they are different partitions, you really don't know which ones to invalidate... there is also the problem of load balancing, deciding when to expand the buckets, etc., which brings us back to the current lock-striping solution IMHO. I can do some benchmarks if that's exactly what we need at this point. Thanks!
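The queue-full concern raised above amounts to making the ring buffer's offer() report failure instead of silently overwriting unread slots, so the caller can back off or drop the event. A minimal, coarsely synchronized sketch — not lruc's actual buffer, which is lock-free:

```java
/** Bounded ring buffer whose offer() detects "queue full" instead of overwriting. */
final class RingBuffer<T> {
    private final Object[] slots;
    private int head; // next index to read
    private int tail; // next index to write
    private int size; // occupied slots

    RingBuffer(int capacity) { slots = new Object[capacity]; }

    synchronized boolean offer(T item) {
        if (size == slots.length)
            return false; // full: caller must back off or drop the event
        slots[tail] = item;
        tail = (tail + 1) % slots.length;
        size++;
        return true;
    }

    @SuppressWarnings("unchecked")
    synchronized T poll() {
        if (size == 0)
            return null; // empty
        T item = (T) slots[head];
        slots[head] = null; // allow GC of the drained slot
        head = (head + 1) % slots.length;
        size--;
        return item;
    }
}
```

A Disruptor-style buffer makes the same decision through its claim strategy: a producer that cannot claim the next sequence either blocks or fails, but it never tramples an unconsumed slot.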
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14197029#comment-14197029 ] Vijay commented on CASSANDRA-7438: -- {quote} Aren't all those objections to the current design {quote} I am fine with making it configurable and maintaining it in a separate project, but I didn't realize that was the case.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195679#comment-14195679 ] Vijay commented on CASSANDRA-7438: -- Thanks for reviewing! {quote} I am also not clear on why locks are necessary for individual items. {quote} We don't lock individual items. We have locks per segment; this is very similar to lock striping, as in Java's ConcurrentHashMap. {quote} global lock in may_expire() quite frequently? {quote} Not really; we only lock globally when we reach 100% of the space, we then free down to 80% of the space, and we spread the overhead across other threads based on whoever holds the item's partition lock. It won't be hard to make this part of the queue thread, and I will try that for the next release of lruc. {quote} What kind of hardware was the benchmark run on? {quote} 32 cores, 100 GB RAM, with NUMA and Intel Xeon. There is a benchmark util checked in as part of the lruc code which does exactly the same kind of test. {quote} You really need a plan for running something like Valgrind {quote} Good point. I was partway down that road and still have the code; I can resurrect it for the next lruc version. {quote} I am not clear on why the JNI is justified {quote} The reasoning is in the comments above (please see them). PS: I believe there were some tickets on the current RowCache complaining about the overhead. {quote} I think JNI would make more sense if we were pulling in existing code like memcached {quote} The code is actually close to memcached. I started off stripping down the memcached code so we could run it in-process instead of as a separate process, removing the global locks in queue reallocation etc., and eventually diverged too much from it. The other reason it doesn't use slab allocators is that we wanted the memory allocator to do the right thing; we have already tested Cassandra with jemalloc.
To comfort you a bit, lruc is already running in our production :)
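The eviction policy described above — lock globally only when usage hits 100% of capacity, then free down to 80% — is a classic high/low-watermark scheme. A simplified sketch with hypothetical names; real lruc tracks off-heap byte sizes and true per-item LRU order:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Watermark eviction sketch: nothing is evicted until usage reaches capacity
 * (100%), then entries are dropped in LRU order until usage falls to the low
 * watermark (80%). Evicting in batches keeps the "global lock" phase rare.
 */
final class WatermarkEvictor {
    private final Deque<Long> lruSizes = new ArrayDeque<>(); // head = least recently used
    private final long capacity;
    private final long lowWatermark;
    private long used;

    WatermarkEvictor(long capacity) {
        this.capacity = capacity;
        this.lowWatermark = capacity * 80 / 100;
    }

    void add(long itemSize) {
        used += itemSize;
        lruSizes.addLast(itemSize);
        if (used >= capacity) {                 // only "lock globally" when completely full...
            while (used > lowWatermark && !lruSizes.isEmpty())
                used -= lruSizes.removeFirst(); // ...then free down to 80%
        }
    }

    long used() { return used; }
}
```

The gap between the two watermarks is what amortizes the cost: each expensive eviction pass buys 20% of headroom before the next one is needed.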
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193760#comment-14193760 ] Vijay commented on CASSANDRA-7438: -- Pushed, thanks! {quote} We should ensure that changes in the serialized format of saved row caches are detected {quote} I don't think we changed the format, did I? {quote} item.refcount - if refcount is updated, the whole cache line needs to be re-fetched (CPU) {quote} The refcount is per item in the cache; for every item inserted, we track it in the item's memory location.
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189364#comment-14189364 ] Vijay commented on CASSANDRA-7438: -- Rebased and pushed with the latest binaries. {quote} the comments in cassandra.yaml could be more fleshy (see below) {quote} Sorry, my bad, I missed it before; thanks for the write-up, I just copied it into the fork. {quote} recommend to use the latest lruc release in C* {quote} Yeah, I set up the release and publishing to Maven Central a few weeks ago.
[jira] [Commented] (CASSANDRA-8188) don't block SocketThread for MessagingService
[ https://issues.apache.org/jira/browse/CASSANDRA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185796#comment-14185796 ] Vijay commented on CASSANDRA-8188:
--
I had the same solution as part of https://issues.apache.org/jira/secure/attachment/12623900/0001-CASSANDRA-6590.patch, but [~brandon.williams] was seeing some weirdness. I was not able to replicate that, though.

don't block SocketThread for MessagingService
-
Key: CASSANDRA-8188
URL: https://issues.apache.org/jira/browse/CASSANDRA-8188
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: yangwei
Assignee: yangwei
Attachments: 0001-don-t-block-SocketThread-for-MessagingService.patch

We have two datacenters, A and B. A node in A cannot handshake its version with nodes in B; logs in A are as follows:
{noformat}
INFO [HANDSHAKE-/B] 2014-10-24 04:29:49,075 OutboundTcpConnection.java (line 395) Cannot handshake version with B
TRACE [WRITE-/B] 2014-10-24 11:02:49,044 OutboundTcpConnection.java (line 368) unable to connect to /B
java.net.ConnectException: Connection refused
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:364)
    at sun.nio.ch.Net.connect(Net.java:356)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
    at java.nio.channels.SocketChannel.open(SocketChannel.java:184)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:134)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:119)
    at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:299)
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150)
{noformat}
The jstack output of the nodes in B shows they block in inputStream.readInt, resulting in the SocketThread not accepting any more sockets:
{noformat}
java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:197)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
    - locked 0x0007963747e8 (a java.lang.Object)
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:203)
    - locked 0x000796374848 (a java.lang.Object)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    - locked 0x0007a5c7ca88 (a sun.nio.ch.SocketAdaptor$SocketInputStream)
    at java.io.InputStream.read(InputStream.java:101)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
    - locked 0x0007a5c7ca88 (a sun.nio.ch.SocketAdaptor$SocketInputStream)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:879)
{noformat}
On the nodes in B, tcpdump shows retransmission of SYN,ACK during the TCP three-way handshake, because the TCP implementation drops the last ACK when the backlog queue is full. On the nodes in B, ss -tl shows Recv-Q 51, Send-Q 50. On the nodes in B, netstat -s shows "SYNs to LISTEN sockets dropped" and "times the listen queue of a socket overflowed" both increasing.
This patch sets the read timeout to 2 * OutboundTcpConnection.WAIT_FOR_VERSION_MAX_TIME for the accepted socket.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
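The one-line fix the report describes (a read timeout of 2 * OutboundTcpConnection.WAIT_FOR_VERSION_MAX_TIME on the accepted socket) can be demonstrated in isolation. A minimal sketch with an assumed timeout value; the class and constant names here are illustrative, not Cassandra's:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Sketch of the fix described above: give the freshly accepted socket a read
// timeout so a peer that connects but never completes the version handshake
// cannot block the accept loop forever. The constant mirrors the patch's
// 2 * WAIT_FOR_VERSION_MAX_TIME idea; the value here is illustrative.
public class HandshakeTimeoutDemo {
    static final int WAIT_FOR_VERSION_MAX_TIME = 1000; // ms, assumed value

    public static boolean readVersionOrTimeOut(Socket accepted) throws IOException {
        accepted.setSoTimeout(2 * WAIT_FOR_VERSION_MAX_TIME);
        try {
            int version = new DataInputStream(accepted.getInputStream()).readInt();
            return true; // handshake completed; 'version' would be validated here
        } catch (SocketTimeoutException e) {
            return false; // slow/broken peer: drop it instead of blocking accept()
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            // the client never writes its version, so the read must time out
            boolean ok = readVersionOrTimeOut(accepted);
            System.out.println(ok ? "handshake" : "timed out");
        }
    }
}
```

Without the setSoTimeout call, readInt blocks indefinitely, which is exactly the stuck SocketThread in the jstack above.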
git commit: fix typo in TriggerExecutor patch by Liang Xie; reviewed by Vijay for CASSANDRA-8184
Repository: cassandra
Updated Branches: refs/heads/cassandra-2.1 7cf3f19be - 658a65b28

fix typo in TriggerExecutor patch by Liang Xie; reviewed by Vijay for CASSANDRA-8184

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/658a65b2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/658a65b2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/658a65b2
Branch: refs/heads/cassandra-2.1
Commit: 658a65b281a2ce11b94eb7a60f54de02a1e85342
Parents: 7cf3f19
Author: Vijay vijay2...@gmail.com
Authored: Fri Oct 24 09:19:40 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Fri Oct 24 09:19:40 2014 -0700
--
src/java/org/apache/cassandra/triggers/TriggerExecutor.java | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/658a65b2/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
--
diff --git a/src/java/org/apache/cassandra/triggers/TriggerExecutor.java b/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
index 001529d..677daad 100644
--- a/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
+++ b/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
@@ -53,10 +53,10 @@ public class TriggerExecutor
      */
     public void reloadClasses()
     {
-        File tiggerDirectory = FBUtilities.cassandraTriggerDir();
-        if (tiggerDirectory == null)
+        File triggerDirectory = FBUtilities.cassandraTriggerDir();
+        if (triggerDirectory == null)
             return;
-        customClassLoader = new CustomClassLoader(parent, tiggerDirectory);
+        customClassLoader = new CustomClassLoader(parent, triggerDirectory);
         cachedTriggers.clear();
     }
[2/2] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2976e693
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2976e693
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2976e693
Branch: refs/heads/trunk
Commit: 2976e69372d261db8275e3cdba28669fab7f3c89
Parents: 9089008 658a65b
Author: Vijay vijay2...@gmail.com
Authored: Fri Oct 24 09:24:05 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Fri Oct 24 09:24:05 2014 -0700
--
src/java/org/apache/cassandra/triggers/TriggerExecutor.java | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/2976e693/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
--
[1/2] git commit: fix typo in TriggerExecutor patch by Liang Xie; reviewed by Vijay for CASSANDRA-8184
Repository: cassandra
Updated Branches: refs/heads/trunk 908900800 - 2976e6937

fix typo in TriggerExecutor patch by Liang Xie; reviewed by Vijay for CASSANDRA-8184

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/658a65b2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/658a65b2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/658a65b2
Branch: refs/heads/trunk
Commit: 658a65b281a2ce11b94eb7a60f54de02a1e85342
Parents: 7cf3f19
Author: Vijay vijay2...@gmail.com
Authored: Fri Oct 24 09:19:40 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Fri Oct 24 09:19:40 2014 -0700
--
src/java/org/apache/cassandra/triggers/TriggerExecutor.java | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/658a65b2/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
--
diff --git a/src/java/org/apache/cassandra/triggers/TriggerExecutor.java b/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
index 001529d..677daad 100644
--- a/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
+++ b/src/java/org/apache/cassandra/triggers/TriggerExecutor.java
@@ -53,10 +53,10 @@ public class TriggerExecutor
      */
     public void reloadClasses()
     {
-        File tiggerDirectory = FBUtilities.cassandraTriggerDir();
-        if (tiggerDirectory == null)
+        File triggerDirectory = FBUtilities.cassandraTriggerDir();
+        if (triggerDirectory == null)
             return;
-        customClassLoader = new CustomClassLoader(parent, tiggerDirectory);
+        customClassLoader = new CustomClassLoader(parent, triggerDirectory);
         cachedTriggers.clear();
     }
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160044#comment-14160044 ] Vijay commented on CASSANDRA-7438:
--
Pushed most of the changes to https://github.com/Vijay2win/cassandra/commits/7438; I am not sure about moving the tests and code into the Cassandra code base (I am really neutral on that). The other related changes, tests, and refactoring are pushed as part of 3 main commits in https://github.com/Vijay2win/lruc/commits/master. cc [~xedin]
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14158005#comment-14158005 ] Vijay commented on CASSANDRA-7438:
--
Hi Jonathan, yes, I am adding more tests and fixing a test failure in lruc; going to post the patch soon.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144260#comment-14144260 ] Vijay commented on CASSANDRA-7438:
--
Hi [~rst...@pironet-ndh.com], I don't see a problem in copying or rewriting the code; once you complete the rest of the review, we can see what we can do. I am guessing you were not waiting for my response :) Thanks!
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7897) NodeTool command to display OffHeap memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-7897:
-
Attachment: 0001-CASSANDRA-7897.patch

Attached is a simple patch for #nodetool memstats:
{quote}
Off-heap memory used for Keyspace1/Standard1
Bloom Filter: 428.53 KB
Index Summary : 59.28 KB
Memtable size : 0 bytes
Other Off-heap (Total)
All Row Cache size : 0 bytes
All Memtable size : 0 bytes
Other On-heap (Total)
All Key Cache size : 1.21 KB
All Memtable size : 45.77 MB
{quote}

NodeTool command to display OffHeap memory usage
Key: CASSANDRA-7897
URL: https://issues.apache.org/jira/browse/CASSANDRA-7897
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Vijay
Assignee: Vijay
Priority: Minor
Attachments: 0001-CASSANDRA-7897.patch

Most of the highest memory consuming data structures in Cassandra are now off-heap. It would be nice to display the memory used by BFs, index summaries, FS buffers, caches, and memtables (when enabled). This ticket is to track and display the off-heap memory allocated/used by the running Cassandra process; this will help users further tune the memory used by these data structures per CF.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
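The human-readable sizes in the sample output above ("428.53 KB", "45.77 MB") are plain base-1024 formatting. A small sketch of such a helper follows; Cassandra has its own utility for this, so this exact code is illustrative only:

```java
import java.util.Locale;

// Sketch: format a raw byte count the way the memstats output above does.
public class ByteSize {
    public static String humanReadable(long bytes) {
        if (bytes < 1024) return bytes + " bytes";
        String[] units = { "KB", "MB", "GB", "TB" };
        double value = bytes;
        int unit = -1;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024; // base-1024 steps: bytes -> KB -> MB -> ...
            unit++;
        }
        return String.format(Locale.ROOT, "%.2f %s", value, units[unit]);
    }

    public static void main(String[] args) {
        System.out.println(humanReadable(438814));   // 428.53 KB
        System.out.println(humanReadable(47993520)); // 45.77 MB
    }
}
```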
[jira] [Created] (CASSANDRA-7897) NodeTool command to display OffHeap memory usage
Vijay created CASSANDRA-7897: Summary: NodeTool command to display OffHeap memory usage Key: CASSANDRA-7897 URL: https://issues.apache.org/jira/browse/CASSANDRA-7897 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Vijay Assignee: Vijay Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075676#comment-14075676 ] Vijay commented on CASSANDRA-7438:
--
Pushed the branch to https://github.com/Vijay2win/cassandra/tree/7438
{quote}Maybe delay Win port{quote}
We should be fine; lruc is configurable alongside the SerializingCache.
{quote}unclean shutdown (kill -9) does not delete the so/dylib file{quote}
Yeah, it works on Unix, but the problem is I don't have a handle to it after a restart, since it's a temp file. So the cleanup is best-effort.
{quote}SWIGTYPE_p_item and SWIGTYPE_p_p_item are unused{quote}
Auto-generated; they can be removed but will be regenerated every time SWIG is run.
{quote}Generally the lruc code could be more integrated in C* code{quote}
The problem is that it produces a circular dependency; please look at df3857e4b9637ed6a5099506e95d84de15bf2eb7 where I removed those (the DOSP added back will still need to be wrapped by Cassandra's DOSP).
{quote}Naming of max_size, capacity{quote}
Yeah, let me make it consistent; the problem was I was trying to fit everything into the Guava interface.
{quote}remove hotN or return an array/list instead{quote}
Or maybe do a memcpy on the keys, since this doesn't need optimization (will fix).
{quote}shouldn't there be something like a yield{quote}
Actually, I removed it recently; adding or removing it doesn't give much of a performance gain, but as a good citizen I should add it back.
{quote}Seems like the C code was not cleaned up{quote}
This cannot be removed; it is needed for the test cases.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075516#comment-14075516 ] Vijay commented on CASSANDRA-7438:
--
{quote} unsafe.memoryAllocate instead and replicate what we do with lruc_item_allocate() {quote}
Done, thanks!
--
This message was sent by Atlassian JIRA (v6.2#6252)
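For readers unfamiliar with the Unsafe-based allocation being discussed ("unsafe.memoryAllocate" in the quote refers to off-heap allocation via sun.misc.Unsafe, whose actual method is allocateMemory), here is a minimal sketch of the Java side. Everything beyond the Unsafe API itself is illustrative, and note that sun.misc.Unsafe is an unsupported, deprecated API:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Sketch: grab the Unsafe singleton reflectively, allocate a raw off-heap
// region (invisible to the GC, like lruc_item_allocate on the C side),
// write/read it, and free it manually.
public class UnsafeAlloc {
    public static Unsafe unsafe() throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    }

    public static void main(String[] args) throws Exception {
        Unsafe u = unsafe();
        long addr = u.allocateMemory(8); // raw native allocation, returns an address
        try {
            u.putLong(addr, 42L);
            System.out.println(u.getLong(addr)); // 42
        } finally {
            u.freeMemory(addr); // unlike heap objects, this must be freed by hand
        }
    }
}
```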
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072250#comment-14072250 ] Vijay commented on CASSANDRA-7438:
--
{quote} for starters, the first 1s in each gives 10x the throughput, and each is then followed by some weirdly lengthy random pauses. SerializingCacheProvider sees a 21s latency pause {quote}
Not sure about which 21-second pause; are you talking about thread creation taking time, especially during startup? Anyway, yes, when the concurrency is lower, here is the output: http://pastebin.com/7fdY7kU1
{quote} The SerializingCacheProvider actually has higher total throughput as well {quote}
Obviously the overhead will be more, because now we are serializing the row key off heap as well, and there are safepoints when calling into JNI. But let me clarify: the goal for the patch is to not run OOM when the cache size is bigger than the heap. If interested, I can run some benchmarks to show that.
{quote} Is this all over CQL3 native {quote}
Yep, that was thrift, and here is CQL3 w/ prepared: http://pastebin.com/auQkH325
--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072819#comment-14072819 ] Vijay commented on CASSANDRA-7438:
--
{quote} This implementation allocates upfront a ring-buffer, after which the entries are allocated through a regular variant of malloc/free. {quote}
Not sure what we are talking about; does "this" == lruc? If yes, the ring buffer is fronting the queue so we don't need a global lock. Robert, thanks!
{quote} Integration tests should be done on most important platforms like Linux x64, OSX, Win64 {quote}
Win64 will be awesome if you generate the dll for lruc. Please send me your GitHub username and I will add you in.
--
This message was sent by Atlassian JIRA (v6.2#6252)
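The "ring buffer fronting the queue" idea above, where cache hits are buffered and a single consumer applies them to the LRU order so readers never take a global lock, can be sketched as follows. This illustrates the general technique only; lruc's actual implementation is in C and differs in detail (for one, a bounded ArrayBlockingQueue stands in for the ring buffer):

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: hits go into a bounded buffer and are drained in one
// place, so the LRU order is mutated by a single thread instead of every
// reader contending on a global lock.
public class BufferedLru<K, V> {
    private final int capacity;
    private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<>();
    private final LinkedHashSet<K> order = new LinkedHashSet<>(); // consumer-only LRU order
    private final ArrayBlockingQueue<K> hitBuffer = new ArrayBlockingQueue<>(1024);

    public BufferedLru(int capacity) { this.capacity = capacity; }

    // Reader path: no lock on the LRU list, just an offer into the buffer.
    public V get(K key) {
        V v = data.get(key);
        if (v != null)
            hitBuffer.offer(key); // if the buffer is full, the hit is simply dropped
        return v;
    }

    public void put(K key, V value) {
        data.put(key, value);
        hitBuffer.offer(key);
        drain();
    }

    // Single consumer applies buffered hits to the LRU order and evicts.
    public synchronized void drain() {
        K key;
        while ((key = hitBuffer.poll()) != null) {
            if (data.containsKey(key)) {
                order.remove(key);
                order.add(key); // move to the most-recently-used position
            }
        }
        while (data.size() > capacity) {
            Iterator<K> it = order.iterator();
            if (!it.hasNext()) break;
            K eldest = it.next();
            it.remove();
            data.remove(eldest); // evict the least recently used entry
        }
    }

    public static void main(String[] args) {
        BufferedLru<String, String> cache = new BufferedLru<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // buffered hit: "a" becomes most recently used
        cache.put("c", "3"); // evicts "b", the least recently used key
        cache.drain();
        System.out.println(cache.get("b")); // null
        System.out.println(cache.get("a")); // 1
    }
}
```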
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069862#comment-14069862 ] Vijay commented on CASSANDRA-7438:
--
The attached patch makes the off-heap/Serializing Cache configurable (the default is still the SerializingCache). Regarding performance: the new cache is obviously better when the JNI overhead is less than the GC overhead, but for smaller caches that fit in memory, the performance is a little lower, which is understandable (both of them outperform page-cache performance by a large margin). Here are the numbers.

OffheapCacheProvider
{noformat}
Running READ with 1200 threads for 1000 iterations
ops, op/s, key/s, mean, med, .95, .99, .999, max, time, stderr
2030355, 2029531, 2029531, 3.1, 3.1, 5.4, 5.7, 61.8, 3014.5, 1.0, 0.0
2395480, 202845, 202845, 5.8, 5.4, 5.8, 20.2, 522.4, 545.9, 2.8, 0.0
2638600, 221368, 221368, 5.4, 5.3, 5.8, 16.3, 78.8, 131.5, 3.9, 0.57860
2891705, 221976, 221976, 5.4, 5.3, 5.6, 6.2, 15.2, 19.2, 5.0, 0.60478
3147747, 222527, 222527, 5.4, 5.3, 5.6, 6.1, 15.4, 18.2, 6.2, 0.58659
3394999, 221527, 221527, 5.4, 5.3, 5.6, 6.6, 15.9, 19.4, 7.3, 0.55884
3663559, 226114, 226114, 5.3, 5.2, 5.6, 15.0, 84.4, 110.7, 8.5, 0.52924
3911154, 223831, 223831, 5.4, 5.3, 5.6, 6.1, 15.6, 20.0, 9.6, 0.50018
4152946, 223246, 223246, 5.4, 5.3, 5.6, 6.1, 15.7, 18.8, 10.7, 0.47323
4403162, 228532, 228532, 5.2, 5.2, 5.6, 23.2, 107.4, 121.4, 11.8, 0.44856
4641021, 225196, 225196, 5.3, 5.2, 5.6, 5.9, 15.3, 18.4, 12.8, 0.42557
4889523, 222826, 222826, 5.4, 5.3, 5.6, 6.3, 16.2, 22.0, 13.9, 0.40476
5124891, 223203, 223203, 5.4, 5.3, 5.6, 5.8, 6.2, 14.8, 15.0, 0.38602
5375262, 221222, 221222, 5.4, 5.2, 5.6, 18.4, 94.2, 115.1, 16.1, 0.36899
5616470, 224022, 224022, 5.4, 5.3, 5.6, 5.9, 14.3, 17.8, 17.2, 0.35349
5866825, 223000, 223000, 5.4, 5.3, 5.6, 6.1, 15.5, 18.2, 18.3, 0.33882
6125601, 225757, 225757, 5.2, 5.3, 5.6, 9.6, 49.4, 72.0, 19.5, 0.32535
6348030, 192703, 192703, 6.3, 5.3, 9.3, 14.4, 77.1, 91.5, 20.6, 0.31282
6483574, 128520, 128520, 9.3, 8.4, 10.9, 19.5, 88.7, 99.0, 21.7, 0.30329
6626176, 137199, 137199, 8.7, 8.4, 10.6, 14.0, 32.7, 40.9, 22.7, 0.29771
6768401, 136860, 136860, 8.8, 8.4, 10.3, 14.1, 35.1, 40.8, 23.8, 0.29213
6911785, 138204, 138204, 8.7, 8.3, 10.2, 13.7, 34.1, 37.8, 24.8, 0.28669
7055951, 138633, 138633, 8.7, 8.3, 10.5, 32.0, 40.5, 46.9, 25.8, 0.28130
7199084, 137731, 137731, 8.7, 8.4, 10.2, 14.0, 33.4, 40.9, 26.9, 0.27623
7338032, 133201, 133201, 9.0, 8.4, 10.9, 34.0, 39.4, 43.8, 27.9, 0.27116
7480439, 137059, 137059, 8.8, 8.4, 10.2, 13.9, 35.9, 39.5, 29.0, 0.26663
7647810, 161209, 161209, 7.5, 7.8, 9.6, 13.4, 33.9, 77.9, 30.0, 0.26185
7898882, 226498, 226498, 5.3, 5.2, 5.6, 19.7, 108.5, 119.3, 31.1, 0.25629
8136305, 223840, 223840, 5.4, 5.3, 5.6, 5.9, 17.3, 23.2, 32.2, 0.24838
8372076, 223790, 223790, 5.4, 5.3, 5.6, 6.0, 15.2, 20.0, 33.2, 0.24095
8633758, 232914, 232914, 5.1, 5.2, 5.6, 17.5, 138.4, 182.0, 34.4, 0.23397
8869214, 43, 43, 5.4, 5.3, 5.6, 6.0, 15.2, 17.9, 35.4, 0.22717
9121652, 223037, 223037, 5.4, 5.3, 5.6, 5.9, 15.4, 18.8, 36.5, 0.22105
9360286, 225070, 225070, 5.3, 5.3, 5.6, 14.8, 82.7, 92.1, 37.6, 0.21524
9609676, 224089, 224089, 5.4, 5.3, 5.6, 5.8, 6.2, 14.3, 38.7, 0.20967
9848551, 222123, 222123, 5.4, 5.3, 5.6, 5.9, 24.2, 27.2, 39.8, 0.20440
1000, 229511, 229511, 5.0, 5.2, 5.8, 60.0, 74.3, 132.0, 40.5, 0.19935
Results:
real op rate : 247211
adjusted op rate stderr : 0
key rate : 247211
latency mean : 5.4
latency median : 3.5
latency 95th percentile : 5.5
latency 99th percentile : 6.1
latency 99.9th percentile : 83.4
latency max : 3014.5
Total operation
{noformat}
[jira] [Updated] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-7438: - Attachment: 0001-CASSANDRA-7438.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069862#comment-14069862 ] Vijay edited comment on CASSANDRA-7438 at 7/22/14 5:54 AM:
---
Attached patch makes the off-heap/SerializingCache choice configurable (the default is still SerializingCache). Regarding performance: the new cache is clearly better when the JNI overhead is less than the GC overhead. For a smaller cache that fits in the JVM heap the performance is a little lower, which is understandable (but both outperform page-cache performance by a large margin). Here are the numbers.

*OffheapCacheProvider*
{panel}
Running READ with 1200 threads for 1000 iterations
ops, op/s, key/s, mean, med, .95, .99, .999, max, time, stderr
2030355, 2029531, 2029531, 3.1, 3.1, 5.4, 5.7, 61.8, 3014.5, 1.0, 0.0
2395480, 202845, 202845, 5.8, 5.4, 5.8, 20.2, 522.4, 545.9, 2.8, 0.0
2638600, 221368, 221368, 5.4, 5.3, 5.8, 16.3, 78.8, 131.5, 3.9, 0.57860
2891705, 221976, 221976, 5.4, 5.3, 5.6, 6.2, 15.2, 19.2, 5.0, 0.60478
3147747, 222527, 222527, 5.4, 5.3, 5.6, 6.1, 15.4, 18.2, 6.2, 0.58659
3394999, 221527, 221527, 5.4, 5.3, 5.6, 6.6, 15.9, 19.4, 7.3, 0.55884
3663559, 226114, 226114, 5.3, 5.2, 5.6, 15.0, 84.4, 110.7, 8.5, 0.52924
3911154, 223831, 223831, 5.4, 5.3, 5.6, 6.1, 15.6, 20.0, 9.6, 0.50018
4152946, 223246, 223246, 5.4, 5.3, 5.6, 6.1, 15.7, 18.8, 10.7, 0.47323
4403162, 228532, 228532, 5.2, 5.2, 5.6, 23.2, 107.4, 121.4, 11.8, 0.44856
4641021, 225196, 225196, 5.3, 5.2, 5.6, 5.9, 15.3, 18.4, 12.8, 0.42557
4889523, 222826, 222826, 5.4, 5.3, 5.6, 6.3, 16.2, 22.0, 13.9, 0.40476
5124891, 223203, 223203, 5.4, 5.3, 5.6, 5.8, 6.2, 14.8, 15.0, 0.38602
5375262, 221222, 221222, 5.4, 5.2, 5.6, 18.4, 94.2, 115.1, 16.1, 0.36899
5616470, 224022, 224022, 5.4, 5.3, 5.6, 5.9, 14.3, 17.8, 17.2, 0.35349
5866825, 223000, 223000, 5.4, 5.3, 5.6, 6.1, 15.5, 18.2, 18.3, 0.33882
6125601, 225757, 225757, 5.2, 5.3, 5.6, 9.6, 49.4, 72.0, 19.5, 0.32535
6348030, 192703, 192703, 6.3, 5.3, 9.3, 14.4, 77.1, 91.5, 20.6, 0.31282
6483574, 128520, 128520, 9.3, 8.4, 10.9, 19.5, 88.7, 99.0, 21.7, 0.30329
6626176, 137199, 137199, 8.7, 8.4, 10.6, 14.0, 32.7, 40.9, 22.7, 0.29771
6768401, 136860, 136860, 8.8, 8.4, 10.3, 14.1, 35.1, 40.8, 23.8, 0.29213
6911785, 138204, 138204, 8.7, 8.3, 10.2, 13.7, 34.1, 37.8, 24.8, 0.28669
7055951, 138633, 138633, 8.7, 8.3, 10.5, 32.0, 40.5, 46.9, 25.8, 0.28130
7199084, 137731, 137731, 8.7, 8.4, 10.2, 14.0, 33.4, 40.9, 26.9, 0.27623
7338032, 133201, 133201, 9.0, 8.4, 10.9, 34.0, 39.4, 43.8, 27.9, 0.27116
7480439, 137059, 137059, 8.8, 8.4, 10.2, 13.9, 35.9, 39.5, 29.0, 0.26663
7647810, 161209, 161209, 7.5, 7.8, 9.6, 13.4, 33.9, 77.9, 30.0, 0.26185
7898882, 226498, 226498, 5.3, 5.2, 5.6, 19.7, 108.5, 119.3, 31.1, 0.25629
8136305, 223840, 223840, 5.4, 5.3, 5.6, 5.9, 17.3, 23.2, 32.2, 0.24838
8372076, 223790, 223790, 5.4, 5.3, 5.6, 6.0, 15.2, 20.0, 33.2, 0.24095
8633758, 232914, 232914, 5.1, 5.2, 5.6, 17.5, 138.4, 182.0, 34.4, 0.23397
8869214, 43, 43, 5.4, 5.3, 5.6, 6.0, 15.2, 17.9, 35.4, 0.22717
9121652, 223037, 223037, 5.4, 5.3, 5.6, 5.9, 15.4, 18.8, 36.5, 0.22105
9360286, 225070, 225070, 5.3, 5.3, 5.6, 14.8, 82.7, 92.1, 37.6, 0.21524
9609676, 224089, 224089, 5.4, 5.3, 5.6, 5.8, 6.2, 14.3, 38.7, 0.20967
9848551, 222123, 222123, 5.4, 5.3, 5.6, 5.9, 24.2, 27.2, 39.8, 0.20440
1000, 229511, 229511, 5.0, 5.2, 5.8, 60.0, 74.3, 132.0, 40.5, 0.19935

Results:
real op rate : 247211
adjusted op rate stderr : 0
key rate : 247211
latency mean : 5.4
latency median : 3.5
latency 95th percentile : 5.5
latency 99th percentile : 6.1
latency 99.9th percentile : 83.4
latency max
[jira] [Commented] (CASSANDRA-7125) Fail to start by default if Commit Log fails to validate any messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052643#comment-14052643 ] Vijay commented on CASSANDRA-7125: -- Forgot to mention I am +1 otherwise. Thanks! One more nit: maybe it would read better if we changed IGNORE_ERRORS to STOP_ON_ERRORS or something like that... Fail to start by default if Commit Log fails to validate any messages - Key: CASSANDRA-7125 URL: https://issues.apache.org/jira/browse/CASSANDRA-7125 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Labels: correctness Fix For: 2.1.1 Current behaviour can be pretty dangerous, and also has a tendency to mask bugs during development. We should change the behaviour to default to failure if anything unexpected happens, and introduce a cassandra.yaml option that permits overriding the default behaviour. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7125) Fail to start by default if Commit Log fails to validate any messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051139#comment-14051139 ] Vijay commented on CASSANDRA-7125: -- Hi [~benedict], are the changes to CommitLogSegment.recycle needed? Nit: the CommitLogDescriptor changes are not needed either (the scoping was right earlier). The CassandraDaemon changes might also not be needed, since we print the warning and exit with a stack trace. Fail to start by default if Commit Log fails to validate any messages - Key: CASSANDRA-7125 URL: https://issues.apache.org/jira/browse/CASSANDRA-7125 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Labels: correctness Fix For: 2.1.1 Current behaviour can be pretty dangerous, and also has a tendency to mask bugs during development. We should change the behaviour to default to failure if anything unexpected happens, and introduce a cassandra.yaml option that permits overriding the default behaviour. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7125) Fail to start by default if Commit Log fails to validate any messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050525#comment-14050525 ] Vijay commented on CASSANDRA-7125: -- Will do, thanks! Fail to start by default if Commit Log fails to validate any messages - Key: CASSANDRA-7125 URL: https://issues.apache.org/jira/browse/CASSANDRA-7125 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Labels: correctness Fix For: 2.1.1 Current behaviour can be pretty dangerous, and also has a tendency to mask bugs during development. We should change the behaviour to default to failure if anything unexpected happens, and introduce a cassandra.yaml option that permits overriding the default behaviour. -- This message was sent by Atlassian JIRA (v6.2#6252)
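The "fail by default, allow an explicit override" behaviour discussed in this ticket can be sketched as follows. This is an illustration only: the class and the system-property name are hypothetical, not the actual cassandra.yaml option the ticket introduces.

```java
// Sketch of "fail by default, allow an explicit override" for commit log
// replay errors. Names are illustrative, NOT the real Cassandra option.
public class ReplayPolicy
{
    // the old permissive behaviour must now be opted into explicitly
    static final boolean IGNORE_ERRORS = Boolean.getBoolean("commitlog.ignore_replay_errors");

    static void onCorruptSegment(String segment, Exception cause)
    {
        if (IGNORE_ERRORS)
        {
            // legacy behaviour: warn and keep replaying
            System.err.println("WARN: skipping corrupt segment " + segment + ": " + cause);
            return;
        }
        // new default: refuse to start rather than silently mask corruption
        throw new IllegalStateException("Corrupt commit log segment: " + segment, cause);
    }
}
```

The naming nit in the comment above applies here too: a flag that *enables* the dangerous path (IGNORE_ERRORS) reads differently from one that *disables* it (STOP_ON_ERRORS), even though both encode the same two behaviours.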
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048590#comment-14048590 ] Vijay commented on CASSANDRA-7438: -- {quote} I can attest that this assumption becomes false in practice {quote} Fixed! :) Serializing Row cache alternative (Fully off heap) -- Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Labels: performance Fix For: 3.0 Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires careful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048443#comment-14048443 ] Vijay commented on CASSANDRA-7438: -- Thanks Robert, pushed the fix for 1 and 2.
{quote} Utils.md5sum() should use thread local instances of MessageDigest {quote}
MD5 here is fine, since it only happens once, when the cache is first initialized.
{quote} the mechanism to extract the so/dylib is a bit error prone since all JVM instances use the same path name. {quote}
IMHO, it's a safe assumption that we don't support multiple versions of the library on the same physical box at the same time.
{quote} counters/timers in C code to track global locks {quote}
NOTE: this happens only during hash-table expansions (not during steady state). Created a ticket to track it in the near future... We may want to do something a little fancier, like Yammer Metrics.
{quote} used > highWaterMark, where highWaterMark is for example 80% of max {quote}
I initially tried using the queue thread to keep track of usage and expire entries proactively, but it involved a global lock and we might end up overcommitting memory, so maybe in the future. (The current implementation works similarly to CLHM.)

Serializing Row cache alternative (Fully off heap)
--
Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Labels: performance Fix For: 3.0

Currently SerializingCache is only partially off heap; keys are still stored in the JVM heap as ByteBuffers.
* There are higher GC costs for a reasonably big cache.
* Some users have used the row cache efficiently in production for better results, but this requires careful tuning.
* The memory overhead for cache entries is relatively high.

So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with the cache. We want to ensure that the new implementation matches the existing API (ICache), and the implementation needs to have safe memory access, low memory overhead, and as few memcpys as possible. We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
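The reviewer's thread-local MessageDigest suggestion is a standard pattern worth spelling out: MessageDigest instances are stateful and not thread-safe, so one instance per thread avoids both data races and per-call allocation. The class name below is hypothetical, not from the patch.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Thread-local MessageDigest, as suggested in the review. Md5Pool is a
// hypothetical name, not a class from the patch.
public class Md5Pool
{
    private static final ThreadLocal<MessageDigest> MD5 = ThreadLocal.withInitial(() -> {
        try
        {
            return MessageDigest.getInstance("MD5");
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new AssertionError(e); // every JVM is required to provide MD5
        }
    });

    public static byte[] md5sum(byte[] input)
    {
        MessageDigest md = MD5.get();
        md.reset(); // clear any state left by a previous caller on this thread
        return md.digest(input);
    }
}
```

As the comment above notes, the optimization only matters on hot paths; a digest computed once at cache initialization gains nothing from it.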
[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046976#comment-14046976 ] Vijay edited comment on CASSANDRA-7438 at 6/28/14 9:50 PM: --- Pushed a new project to github https://github.com/Vijay2win/lruc, including benchmark utils. I can move the code to Cassandra repo or use it as a library in Cassandra (Working on it). was (Author: vijay2...@yahoo.com): Pushed a new project in github https://github.com/Vijay2win/lruc, including benchmark utils. I can move the code to Cassandra repo or use it as a library in Cassandra (Working on it). Serializing Row cache alternative (Fully off heap) -- Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Labels: performance Fix For: 3.0 Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires careful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046976#comment-14046976 ] Vijay commented on CASSANDRA-7438: -- Pushed a new project in github https://github.com/Vijay2win/lruc, including benchmark utils. I can move the code to Cassandra repo or use it as a library in Cassandra (Working on it). Serializing Row cache alternative (Fully off heap) -- Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Labels: performance Fix For: 3.0 Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires careful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
Vijay created CASSANDRA-7438: Summary: Serializing Row cache alternative (Fully off heap) Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Fix For: 3.0 Current off heap row cache is just partially off heap the keys are still stored in memory, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires carful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-7438: - Description: Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires carful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. was: Current off heap row cache is just partially off heap the keys are still stored in memory, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires carful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. Serializing Row cache alternative (Fully off heap) -- Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Fix For: 3.0 Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. 
* Some users have used the row cache efficiently in production for better results, but this requires carful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-7438: - Description: Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires careful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. was: Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires carful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. Serializing Row cache alternative (Fully off heap) -- Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Fix For: 3.0 Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. 
* Some users have used the row cache efficiently in production for better results, but this requires careful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7090) Add ability to set/get logging levels to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983308#comment-13983308 ] Vijay commented on CASSANDRA-7090: -- Yeah, the ticket is marked for 2.0, so if we need to do something for 2.0 we need a patch to make it work with slf4j. Either way works. Add ability to set/get logging levels to nodetool -- Key: CASSANDRA-7090 URL: https://issues.apache.org/jira/browse/CASSANDRA-7090 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jackson Chung Assignee: Jackson Chung Priority: Minor Fix For: 2.0.8, 2.1 beta2 Attachments: 0001-CASSANDRA-7090.patch, logging.diff, patch-7090.v20 While it is nice to use logback (per CASSANDRA-5883) with the autoreload feature, in some cases ops/admins may not have permission or the ability to modify the configuration file(s). Or the files are controlled by puppet/chef, so it is not desirable to modify them manually. There is already an existing operation for setLoggingLevel in StorageServiceMBean, so it is easy to expose that to nodetool. What was lacking was the ability to see the current log-level settings for the various loggers. The attached diff aims to do 3 things: # add JMX getLoggingLevels -- return a map of current loggers and the corresponding levels # expose both getLoggingLevels and setLoggingLevel to nodetool. In particular, setLoggingLevel behaves as follows: #* If both classQualifier and level are empty/null, it will reload the configuration to reset. #* If classQualifier is not empty but level is empty/null, it will set the level to null for the given classQualifier #* The logback configuration should have <jmxConfigurator /> set The diff is based on the master branch which uses logback, so it is not applicable to 2.0 or 1.2. (2.1 is ok.) Though it would be nice to have the same ability for 2.0. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7090) Add ability to set/get logging levels to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983325#comment-13983325 ] Vijay commented on CASSANDRA-7090: -- Ahaa let me review it. Thanks Add ability to set/get logging levels to nodetool -- Key: CASSANDRA-7090 URL: https://issues.apache.org/jira/browse/CASSANDRA-7090 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jackson Chung Assignee: Jackson Chung Priority: Minor Fix For: 2.0.8, 2.1 beta2 Attachments: 0001-CASSANDRA-7090.patch, logging.diff, patch-7090.v20 While it is nice to use logback (per #CASSANDRA-5883) and with the autoreload feature, in some cases ops/admin may not have the permission or ability to modify the configuration file(s). Or the files are controlled by puppet/chef so it is not desirable to modify them manually. There is already an existing operation for setLoggingLevel in the StorageServuceMBean , so that's easy to expose that to the nodetool But what was lacking was ability to see the current log level settings for various loggers. The attached diff aims to do 3 things: # add JMX getLoggingLevels -- return a map of current loggers and the corresponding levels # expose both getLoggingLevels and setLoggingLevel to nodetool. In particular, the setLoggingLevel behave as follows: #* If both classQualifer and level are empty/null, it will reload the configuration to reset. #* If classQualifer is not empty but level is empty/null, it will set the level to null for the defined classQualifer #* The logback configuration should have jmxConfigurator / set The diff is based on the master branch which uses logback, soit is not applicable to 2.0 or 1.2. (2.1 is ok) Though it would be nice to have the same ability for 2.0. -- This message was sent by Atlassian JIRA (v6.2#6252)
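The three-way setLoggingLevel dispatch described in the ticket (reload / clear / set) can be sketched as standalone logic. Names here are illustrative; the real code lives in StorageService and acts on the logging backend directly.

```java
// Standalone sketch of the setLoggingLevel dispatch rules from the ticket
// (illustrative names, not the actual Cassandra classes).
public class LogLevelCmd
{
    enum Action { RELOAD_CONFIG, CLEAR_LOGGER_LEVEL, SET_LOGGER_LEVEL }

    static boolean isBlank(String s)
    {
        return s == null || s.trim().isEmpty();
    }

    static Action decide(String classQualifier, String rawLevel)
    {
        if (isBlank(classQualifier) && isBlank(rawLevel))
            return Action.RELOAD_CONFIG;      // reset everything from the config file
        if (isBlank(rawLevel))
            return Action.CLEAR_LOGGER_LEVEL; // logger inherits its parent's level again
        return Action.SET_LOGGER_LEVEL;       // e.g. ("org.apache.cassandra", "DEBUG")
    }
}
```

Encoding all three behaviours behind one JMX operation is what lets a single nodetool subcommand cover "reset", "clear one logger", and "set one logger" without new MBean methods.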
git commit: Add ability to set/get logging levels to nodetool patch by Jackson Chung reviewed by Vijay for CASSANDRA-6751
Repository: cassandra
Updated Branches: refs/heads/cassandra-2.0 daf54c5c7 -> 0a20f5f17

Add ability to set/get logging levels to nodetool patch by Jackson Chung reviewed by Vijay for CASSANDRA-6751

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0a20f5f1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0a20f5f1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0a20f5f1
Branch: refs/heads/cassandra-2.0
Commit: 0a20f5f170b3596e6e74bba7daddebd4c1f5963a
Parents: daf54c5
Author: Vijay vijay2...@gmail.com
Authored: Mon Apr 28 21:24:07 2014 -0700
Committer: Vijay vijay2...@gmail.com
Committed: Mon Apr 28 21:27:58 2014 -0700
--
 .../cassandra/service/StorageService.java       | 41 ++--
 .../cassandra/service/StorageServiceMBean.java  |  2 +
 .../org/apache/cassandra/tools/NodeCmd.java     | 29 +-
 .../org/apache/cassandra/tools/NodeProbe.java   | 10 +
 .../apache/cassandra/tools/NodeToolHelp.yaml    |  6 +++
 5 files changed, 84 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a20f5f1/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index 75f6427..f44eaed 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -45,6 +45,7 @@ import com.google.common.util.concurrent.Uninterruptibles;
 import org.apache.cassandra.cql3.CQL3Type;
 import org.apache.commons.lang3.StringUtils;
 import org.apache.log4j.Level;
+import org.apache.log4j.LogManager;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -57,7 +58,6 @@ import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.config.KSMetaData;
 import org.apache.cassandra.config.Schema;
 import org.apache.cassandra.db.*;
-import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.db.commitlog.CommitLog;
 import org.apache.cassandra.db.index.SecondaryIndex;
 import org.apache.cassandra.dht.*;
@@ -2795,9 +2795,44 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
     public void setLog4jLevel(String classQualifier, String rawLevel)
     {
+        org.apache.log4j.Logger log4jlogger = org.apache.log4j.Logger.getLogger(classQualifier);
+        // if both classQualifer and rawLevel are empty, reload from configuration
+        if (StringUtils.isBlank(classQualifier) && StringUtils.isBlank(rawLevel))
+        {
+            LogManager.resetConfiguration();
+            CassandraDaemon.initLog4j();
+            return;
+        }
+        // classQualifer is set, but blank level given
+        else if (StringUtils.isNotBlank(classQualifier) && StringUtils.isBlank(rawLevel))
+        {
+            if (log4jlogger.getLevel() != null || log4jlogger.getAllAppenders().hasMoreElements())
+                log4jlogger.setLevel(null);
+            return;
+        }
+
         Level level = Level.toLevel(rawLevel);
-        org.apache.log4j.Logger.getLogger(classQualifier).setLevel(level);
-        logger.info("set log level to " + level + " for classes under '" + classQualifier + "' (if the level doesn't look like '" + rawLevel + "' then log4j couldn't parse '" + rawLevel + "')");
+        log4jlogger.setLevel(level);
+        logger.info("set log level to {} for classes under '{}' (if the level doesn't look like '{}' then the logger couldn't parse '{}')", level, classQualifier, rawLevel, rawLevel);
+    }
+
+    /**
+     * @return the runtime logging levels for all the configured loggers
+     */
+    @Override
+    public Map<String, String> getLoggingLevels()
+    {
+        Map<String, String> logLevelMaps = Maps.newLinkedHashMap();
+        org.apache.log4j.Logger rootLogger = org.apache.log4j.Logger.getRootLogger();
+        logLevelMaps.put(rootLogger.getName(), rootLogger.getLevel().toString());
+        Enumeration<org.apache.log4j.Logger> loggers = LogManager.getCurrentLoggers();
+        while (loggers.hasMoreElements())
+        {
+            org.apache.log4j.Logger logger = loggers.nextElement();
+            if (logger.getLevel() != null)
+                logLevelMaps.put(logger.getName(), logger.getLevel().toString());
+        }
+        return logLevelMaps;
     }

     /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a20f5f1/src/java/org/apache/cassandra/service/StorageServiceMBean.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageServiceMBean.java b/src/java/org/apache/cassandra/service
[3/3] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
    src/java/org/apache/cassandra/service/StorageService.java
    src/java/org/apache/cassandra/tools/NodeCmd.java
    src/resources/org/apache/cassandra/tools/NodeToolHelp.yaml

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d402cf68
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d402cf68
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d402cf68
Branch: refs/heads/cassandra-2.1
Commit: d402cf687001390cf81307e2d9d45a87b24f5837
Parents: 24eeeb9 0a20f5f
Author: Vijay <vijay2...@gmail.com>
Authored: Mon Apr 28 21:38:22 2014 -0700
Committer: Vijay <vijay2...@gmail.com>
Committed: Mon Apr 28 21:38:22 2014 -0700

 conf/logback.xml                                |  2 +-
 .../cassandra/service/StorageService.java       | 49 ++--
 .../cassandra/service/StorageServiceMBean.java  | 21 -
 .../org/apache/cassandra/tools/NodeProbe.java   | 17 +++
 .../org/apache/cassandra/tools/NodeTool.java    | 32 -
 5 files changed, 114 insertions(+), 7 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d402cf68/conf/logback.xml
diff --cc conf/logback.xml
index 655bcf6,000..2657174
mode 100644,00..100644
--- a/conf/logback.xml
+++ b/conf/logback.xml
@@@ -1,34 -1,0 +1,34 @@@
+<configuration scan="true">
++  <jmxConfigurator />
+  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
+    <file>/var/log/cassandra/system.log</file>
+    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
+      <fileNamePattern>/var/log/cassandra/system.log.%i.zip</fileNamePattern>
+      <minIndex>1</minIndex>
+      <maxIndex>20</maxIndex>
+    </rollingPolicy>
+    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
+      <maxFileSize>20MB</maxFileSize>
+    </triggeringPolicy>
+    <encoder>
+      <pattern>%-5level [%thread] %date{ISO8601} %F:%L - %msg%n</pattern>
+      <!-- old-style log format
+      <pattern>%5level [%thread] %date{ISO8601} %F (line %L) %msg%n</pattern>
+      -->
+    </encoder>
+  </appender>
+
+  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
+    <encoder>
+      <pattern>%-5level %date{HH:mm:ss,SSS} %msg%n</pattern>
+    </encoder>
+  </appender>
+
+  <root level="INFO">
+    <appender-ref ref="FILE" />
+    <appender-ref ref="STDOUT" />
+  </root>
+
+  <logger name="com.thinkaurelius.thrift" level="ERROR"/>
+</configuration>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d402cf68/src/java/org/apache/cassandra/service/StorageService.java
diff --cc src/java/org/apache/cassandra/service/StorageService.java
index a9c0233,f44eaed..a284ab4
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@@ -29,14 -29,11 +29,20 @@@ import java.util.*
 import java.util.concurrent.*;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.concurrent.atomic.AtomicLong;

++import javax.management.JMX;
 import javax.management.MBeanServer;
 import javax.management.Notification;
 import javax.management.NotificationBroadcasterSupport;
 import javax.management.ObjectName;
+import javax.management.openmbean.TabularData;
+import javax.management.openmbean.TabularDataSupport;

++import ch.qos.logback.classic.LoggerContext;
++import ch.qos.logback.classic.jmx.JMXConfiguratorMBean;
++import ch.qos.logback.classic.spi.ILoggingEvent;
++import ch.qos.logback.core.Appender;

 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Predicate;
 import com.google.common.collect.*;
@@@ -44,9 -42,10 +50,8 @@@ import com.google.common.util.concurren
 import com.google.common.util.concurrent.Futures;
 import com.google.common.util.concurrent.Uninterruptibles;

--import org.apache.cassandra.cql3.CQL3Type;
 import org.apache.commons.lang3.StringUtils;
-import org.apache.log4j.Level;
-import org.apache.log4j.LogManager;

 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@@ -2874,15 -2793,49 +2879,53 @@@ public class StorageService extends Not
         return liveEps;
     }

- public void setLoggingLevel(String classQualifier, String rawLevel)
-public void setLog4jLevel(String classQualifier, String rawLevel)
++public void setLoggingLevel(String classQualifier, String rawLevel) throws Exception
 {
-    org.apache.log4j.Logger log4jlogger = org.apache.log4j.Logger.getLogger(classQualifier);
+    ch.qos.logback.classic.Logger logBackLogger = (ch.qos.logback.classic.Logger
[1/3] git commit: Add ability to set/get logging levels to nodetool patch by Jackson Chung reviewed by Vijay for CASSANDRA-6751
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 24eeeb91d - d402cf687 Add ability to set/get logging levels to nodetool patch by Jackson Chung reviewed by Vijay for CASSANDRA-6751 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0a20f5f1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0a20f5f1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0a20f5f1 Branch: refs/heads/cassandra-2.1 Commit: 0a20f5f170b3596e6e74bba7daddebd4c1f5963a Parents: daf54c5 Author: Vijay vijay2...@gmail.com Authored: Mon Apr 28 21:24:07 2014 -0700 Committer: Vijay vijay2...@gmail.com Committed: Mon Apr 28 21:27:58 2014 -0700 -- .../cassandra/service/StorageService.java | 41 ++-- .../cassandra/service/StorageServiceMBean.java | 2 + .../org/apache/cassandra/tools/NodeCmd.java | 29 +- .../org/apache/cassandra/tools/NodeProbe.java | 10 + .../apache/cassandra/tools/NodeToolHelp.yaml| 6 +++ 5 files changed, 84 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a20f5f1/src/java/org/apache/cassandra/service/StorageService.java -- diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java index 75f6427..f44eaed 100644 --- a/src/java/org/apache/cassandra/service/StorageService.java +++ b/src/java/org/apache/cassandra/service/StorageService.java @@ -45,6 +45,7 @@ import com.google.common.util.concurrent.Uninterruptibles; import org.apache.cassandra.cql3.CQL3Type; import org.apache.commons.lang3.StringUtils; import org.apache.log4j.Level; +import org.apache.log4j.LogManager; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -57,7 +58,6 @@ import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.KSMetaData; import org.apache.cassandra.config.Schema; import org.apache.cassandra.db.*; -import org.apache.cassandra.db.Keyspace; import 
org.apache.cassandra.db.commitlog.CommitLog;
 import org.apache.cassandra.db.index.SecondaryIndex;
 import org.apache.cassandra.dht.*;
@@ -2795,9 +2795,44 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
     public void setLog4jLevel(String classQualifier, String rawLevel)
     {
+        org.apache.log4j.Logger log4jlogger = org.apache.log4j.Logger.getLogger(classQualifier);
+        // if both classQualifier and rawLevel are empty, reload from configuration
+        if (StringUtils.isBlank(classQualifier) && StringUtils.isBlank(rawLevel))
+        {
+            LogManager.resetConfiguration();
+            CassandraDaemon.initLog4j();
+            return;
+        }
+        // classQualifier is set, but blank level given
+        else if (StringUtils.isNotBlank(classQualifier) && StringUtils.isBlank(rawLevel))
+        {
+            if (log4jlogger.getLevel() != null || log4jlogger.getAllAppenders().hasMoreElements())
+                log4jlogger.setLevel(null);
+            return;
+        }
+
         Level level = Level.toLevel(rawLevel);
-        org.apache.log4j.Logger.getLogger(classQualifier).setLevel(level);
-        logger.info("set log level to " + level + " for classes under '" + classQualifier + "' (if the level doesn't look like '" + rawLevel + "' then log4j couldn't parse '" + rawLevel + "')");
+        log4jlogger.setLevel(level);
+        logger.info("set log level to {} for classes under '{}' (if the level doesn't look like '{}' then the logger couldn't parse '{}')", level, classQualifier, rawLevel, rawLevel);
     }

+    /**
+     * @return the runtime logging levels for all the configured loggers
+     */
+    @Override
+    public Map<String, String> getLoggingLevels()
+    {
+        Map<String, String> logLevelMaps = Maps.newLinkedHashMap();
+        org.apache.log4j.Logger rootLogger = org.apache.log4j.Logger.getRootLogger();
+        logLevelMaps.put(rootLogger.getName(), rootLogger.getLevel().toString());
+        Enumeration<org.apache.log4j.Logger> loggers = LogManager.getCurrentLoggers();
+        while (loggers.hasMoreElements())
+        {
+            org.apache.log4j.Logger logger = loggers.nextElement();
+            if (logger.getLevel() != null)
+                logLevelMaps.put(logger.getName(), logger.getLevel().toString());
+        }
+        return logLevelMaps;
+    }

     /**
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a20f5f1/src/java/org/apache/cassandra/service/StorageServiceMBean.java
diff --git a/src/java/org/apache/cassandra/service/StorageServiceMBean.java b/src/java/org/apache/cassandra/service
[1/4] git commit: Add ability to set/get logging levels to nodetool patch by Jackson Chung reviewed by Vijay for CASSANDRA-6751
Repository: cassandra Updated Branches: refs/heads/trunk afe2e1407 - b2ef56478 Add ability to set/get logging levels to nodetool patch by Jackson Chung reviewed by Vijay for CASSANDRA-6751 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0a20f5f1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0a20f5f1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0a20f5f1 Branch: refs/heads/trunk Commit: 0a20f5f170b3596e6e74bba7daddebd4c1f5963a Parents: daf54c5 Author: Vijay vijay2...@gmail.com Authored: Mon Apr 28 21:24:07 2014 -0700 Committer: Vijay vijay2...@gmail.com Committed: Mon Apr 28 21:27:58 2014 -0700 -- .../cassandra/service/StorageService.java | 41 ++-- .../cassandra/service/StorageServiceMBean.java | 2 + .../org/apache/cassandra/tools/NodeCmd.java | 29 +- .../org/apache/cassandra/tools/NodeProbe.java | 10 + .../apache/cassandra/tools/NodeToolHelp.yaml| 6 +++ 5 files changed, 84 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a20f5f1/src/java/org/apache/cassandra/service/StorageService.java -- diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java index 75f6427..f44eaed 100644 --- a/src/java/org/apache/cassandra/service/StorageService.java +++ b/src/java/org/apache/cassandra/service/StorageService.java @@ -45,6 +45,7 @@ import com.google.common.util.concurrent.Uninterruptibles; import org.apache.cassandra.cql3.CQL3Type; import org.apache.commons.lang3.StringUtils; import org.apache.log4j.Level; +import org.apache.log4j.LogManager; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -57,7 +58,6 @@ import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.KSMetaData; import org.apache.cassandra.config.Schema; import org.apache.cassandra.db.*; -import org.apache.cassandra.db.Keyspace; import 
org.apache.cassandra.db.commitlog.CommitLog;
 import org.apache.cassandra.db.index.SecondaryIndex;
 import org.apache.cassandra.dht.*;
@@ -2795,9 +2795,44 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
     public void setLog4jLevel(String classQualifier, String rawLevel)
     {
+        org.apache.log4j.Logger log4jlogger = org.apache.log4j.Logger.getLogger(classQualifier);
+        // if both classQualifier and rawLevel are empty, reload from configuration
+        if (StringUtils.isBlank(classQualifier) && StringUtils.isBlank(rawLevel))
+        {
+            LogManager.resetConfiguration();
+            CassandraDaemon.initLog4j();
+            return;
+        }
+        // classQualifier is set, but blank level given
+        else if (StringUtils.isNotBlank(classQualifier) && StringUtils.isBlank(rawLevel))
+        {
+            if (log4jlogger.getLevel() != null || log4jlogger.getAllAppenders().hasMoreElements())
+                log4jlogger.setLevel(null);
+            return;
+        }
+
         Level level = Level.toLevel(rawLevel);
-        org.apache.log4j.Logger.getLogger(classQualifier).setLevel(level);
-        logger.info("set log level to " + level + " for classes under '" + classQualifier + "' (if the level doesn't look like '" + rawLevel + "' then log4j couldn't parse '" + rawLevel + "')");
+        log4jlogger.setLevel(level);
+        logger.info("set log level to {} for classes under '{}' (if the level doesn't look like '{}' then the logger couldn't parse '{}')", level, classQualifier, rawLevel, rawLevel);
     }

+    /**
+     * @return the runtime logging levels for all the configured loggers
+     */
+    @Override
+    public Map<String, String> getLoggingLevels()
+    {
+        Map<String, String> logLevelMaps = Maps.newLinkedHashMap();
+        org.apache.log4j.Logger rootLogger = org.apache.log4j.Logger.getRootLogger();
+        logLevelMaps.put(rootLogger.getName(), rootLogger.getLevel().toString());
+        Enumeration<org.apache.log4j.Logger> loggers = LogManager.getCurrentLoggers();
+        while (loggers.hasMoreElements())
+        {
+            org.apache.log4j.Logger logger = loggers.nextElement();
+            if (logger.getLevel() != null)
+                logLevelMaps.put(logger.getName(), logger.getLevel().toString());
+        }
+        return logLevelMaps;
+    }

     /**
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a20f5f1/src/java/org/apache/cassandra/service/StorageServiceMBean.java
diff --git a/src/java/org/apache/cassandra/service/StorageServiceMBean.java b/src/java/org/apache/cassandra/service/StorageServiceMBean.java
index
[4/4] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b2ef5647 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b2ef5647 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b2ef5647 Branch: refs/heads/trunk Commit: b2ef56478d5be36d5e20932428698ed930b9b992 Parents: afe2e14 d402cf6 Author: Vijay vijay2...@gmail.com Authored: Mon Apr 28 21:39:20 2014 -0700 Committer: Vijay vijay2...@gmail.com Committed: Mon Apr 28 21:39:20 2014 -0700 -- conf/logback.xml| 2 +- .../cassandra/service/StorageService.java | 49 ++-- .../cassandra/service/StorageServiceMBean.java | 21 - .../org/apache/cassandra/tools/NodeProbe.java | 17 +++ .../org/apache/cassandra/tools/NodeTool.java| 32 - 5 files changed, 114 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/b2ef5647/src/java/org/apache/cassandra/service/StorageService.java --
[3/4] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
    src/java/org/apache/cassandra/service/StorageService.java
    src/java/org/apache/cassandra/tools/NodeCmd.java
    src/resources/org/apache/cassandra/tools/NodeToolHelp.yaml

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d402cf68
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d402cf68
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d402cf68
Branch: refs/heads/trunk
Commit: d402cf687001390cf81307e2d9d45a87b24f5837
Parents: 24eeeb9 0a20f5f
Author: Vijay <vijay2...@gmail.com>
Authored: Mon Apr 28 21:38:22 2014 -0700
Committer: Vijay <vijay2...@gmail.com>
Committed: Mon Apr 28 21:38:22 2014 -0700

 conf/logback.xml                                |  2 +-
 .../cassandra/service/StorageService.java       | 49 ++--
 .../cassandra/service/StorageServiceMBean.java  | 21 -
 .../org/apache/cassandra/tools/NodeProbe.java   | 17 +++
 .../org/apache/cassandra/tools/NodeTool.java    | 32 -
 5 files changed, 114 insertions(+), 7 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d402cf68/conf/logback.xml
diff --cc conf/logback.xml
index 655bcf6,000..2657174
mode 100644,00..100644
--- a/conf/logback.xml
+++ b/conf/logback.xml
@@@ -1,34 -1,0 +1,34 @@@
+<configuration scan="true">
++  <jmxConfigurator />
+  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
+    <file>/var/log/cassandra/system.log</file>
+    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
+      <fileNamePattern>/var/log/cassandra/system.log.%i.zip</fileNamePattern>
+      <minIndex>1</minIndex>
+      <maxIndex>20</maxIndex>
+    </rollingPolicy>
+    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
+      <maxFileSize>20MB</maxFileSize>
+    </triggeringPolicy>
+    <encoder>
+      <pattern>%-5level [%thread] %date{ISO8601} %F:%L - %msg%n</pattern>
+      <!-- old-style log format
+      <pattern>%5level [%thread] %date{ISO8601} %F (line %L) %msg%n</pattern>
+      -->
+    </encoder>
+  </appender>
+
+  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
+    <encoder>
+      <pattern>%-5level %date{HH:mm:ss,SSS} %msg%n</pattern>
+    </encoder>
+  </appender>
+
+  <root level="INFO">
+    <appender-ref ref="FILE" />
+    <appender-ref ref="STDOUT" />
+  </root>
+
+  <logger name="com.thinkaurelius.thrift" level="ERROR"/>
+</configuration>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d402cf68/src/java/org/apache/cassandra/service/StorageService.java
diff --cc src/java/org/apache/cassandra/service/StorageService.java
index a9c0233,f44eaed..a284ab4
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@@ -29,14 -29,11 +29,20 @@@ import java.util.*
 import java.util.concurrent.*;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.concurrent.atomic.AtomicLong;

++import javax.management.JMX;
 import javax.management.MBeanServer;
 import javax.management.Notification;
 import javax.management.NotificationBroadcasterSupport;
 import javax.management.ObjectName;
+import javax.management.openmbean.TabularData;
+import javax.management.openmbean.TabularDataSupport;

++import ch.qos.logback.classic.LoggerContext;
++import ch.qos.logback.classic.jmx.JMXConfiguratorMBean;
++import ch.qos.logback.classic.spi.ILoggingEvent;
++import ch.qos.logback.core.Appender;

 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Predicate;
 import com.google.common.collect.*;
@@@ -44,9 -42,10 +50,8 @@@ import com.google.common.util.concurren
 import com.google.common.util.concurrent.Futures;
 import com.google.common.util.concurrent.Uninterruptibles;

--import org.apache.cassandra.cql3.CQL3Type;
 import org.apache.commons.lang3.StringUtils;
-import org.apache.log4j.Level;
-import org.apache.log4j.LogManager;

 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@@ -2874,15 -2793,49 +2879,53 @@@ public class StorageService extends Not
         return liveEps;
     }

- public void setLoggingLevel(String classQualifier, String rawLevel)
-public void setLog4jLevel(String classQualifier, String rawLevel)
++public void setLoggingLevel(String classQualifier, String rawLevel) throws Exception
 {
-    org.apache.log4j.Logger log4jlogger = org.apache.log4j.Logger.getLogger(classQualifier);
+    ch.qos.logback.classic.Logger logBackLogger = (ch.qos.logback.classic.Logger) LoggerFactory.getLogger
[jira] [Updated] (CASSANDRA-7090) Add ability to set/get logging levels to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-7090:
-----------------------------
Attachment: 0001-CASSANDRA-7090.patch

Hi Jackson, the attached patch removes the changes to the imports, makes formatting changes to match Cassandra's format, and removes the changes to the log location. Can you rebase to 2.0?

Add ability to set/get logging levels to nodetool
-------------------------------------------------
Key: CASSANDRA-7090
URL: https://issues.apache.org/jira/browse/CASSANDRA-7090
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Jackson Chung
Assignee: Jackson Chung
Priority: Minor
Fix For: 2.0.8, 2.1 beta2
Attachments: 0001-CASSANDRA-7090.patch, logging.diff, patch-7090.v20

While it is nice to use logback (per CASSANDRA-5883), and with the auto-reload feature, in some cases ops/admins may not have the permission or ability to modify the configuration file(s). Or the files are controlled by puppet/chef, so it is not desirable to modify them manually.

There is already an existing operation for setLoggingLevel in the StorageServiceMBean, so it is easy to expose that to nodetool. What was lacking was the ability to see the current log level settings for the various loggers.

The attached diff aims to do 3 things:
# add a JMX getLoggingLevels -- return a map of the current loggers and their corresponding levels
# expose both getLoggingLevels and setLoggingLevel to nodetool. In particular, setLoggingLevel behaves as follows:
#* If both classQualifier and level are empty/null, it will reload the configuration to reset.
#* If classQualifier is not empty but level is empty/null, it will set the level to null for the given classQualifier.
#* The logback configuration should have <jmxConfigurator /> set.

The diff is based on the master branch, which uses logback, so it is not applicable to 2.0 or 1.2 (2.1 is ok). Though it would be nice to have the same ability for 2.0.

-- This message was sent by Atlassian JIRA (v6.2#6252)
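The three setLoggingLevel behaviors listed above can be sketched as a small state machine over a logger-name-to-level map. This is a hypothetical model for illustration only, not Cassandra code: the class name, the Map-backed "registry", and the reset-to-INFO default are all assumptions standing in for the real logback reload and JMX plumbing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the CASSANDRA-7090 setLoggingLevel/getLoggingLevels semantics.
public class LoggingLevelModel {
    // logger name -> level; a null value means "inherit from parent/root"
    private final Map<String, String> levels = new LinkedHashMap<>();

    public LoggingLevelModel() {
        levels.put("ROOT", "INFO"); // stand-in for the on-disk configuration
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public void setLoggingLevel(String classQualifier, String rawLevel) {
        if (isBlank(classQualifier) && isBlank(rawLevel)) {
            // both empty: reload configuration to reset (modeled as a reset here)
            levels.clear();
            levels.put("ROOT", "INFO");
        } else if (!isBlank(classQualifier) && isBlank(rawLevel)) {
            // qualifier given, blank level: clear that logger's explicit level
            levels.put(classQualifier, null);
        } else {
            // both given: set the level explicitly
            levels.put(classQualifier, rawLevel.toUpperCase());
        }
    }

    // Only loggers with an explicit level are reported, mirroring the JMX getter.
    public Map<String, String> getLoggingLevels() {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : levels.entrySet())
            if (e.getValue() != null)
                out.put(e.getKey(), e.getValue());
        return out;
    }

    public static void main(String[] args) {
        LoggingLevelModel m = new LoggingLevelModel();
        m.setLoggingLevel("org.apache.cassandra", "debug");
        System.out.println(m.getLoggingLevels());
    }
}
```

The point of the model is the precedence of the two "blank" cases: a fully blank call is a reset, while a blank level with a non-blank qualifier only clears that one logger.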
[jira] [Commented] (CASSANDRA-7082) Nodetool status always displays the first token instead of the number of vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981672#comment-13981672 ]

Vijay commented on CASSANDRA-7082:
----------------------------------
+1 Thanks!

Nodetool status always displays the first token instead of the number of vnodes
-------------------------------------------------------------------------------
Key: CASSANDRA-7082
URL: https://issues.apache.org/jira/browse/CASSANDRA-7082
Project: Cassandra
Issue Type: Bug
Components: Tools
Reporter: Jivko Donev
Assignee: Brandon Williams
Priority: Minor
Labels: nodetool
Fix For: 1.2.17, 2.0.8
Attachments: 7082.txt

The nodetool status command always displays the first token for a node, even when using vnodes. The defect is only reproduced on version 2.0.7. With the same configuration, 2.0.7 displays:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address    Load       Owns (effective)  Host ID                               Token                 Rack
UN 127.0.0.1  156.34 KB  100.0%            d6629553-d6e9-434d-bf01-54c257b20ea9  -9134643033027010921  Rack1

But 2.0.6 displays:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address    Load       Tokens  Owns    Host ID
UN 127.0.0.1  210.32 KB  256     100.0%  08208ec9-8976-4ad0-b6bb-ee5dcf0109e

The problem seems to be in NodeCmd.java, in the check for vnodes. In the print() method there is a check:

// More tokens than nodes (aka vnodes)?
if (tokensToEndpoints.values().size() < tokensToEndpoints.keySet().size())
    isTokenPerNode = false;

while in 2.0.6 the same code was:

// More tokens than nodes (aka vnodes)?
if (new HashSet<String>(tokensToEndpoints.values()).size() < tokensToEndpoints.keySet().size())
    isTokenPerNode = false;

In 2.0.7 this check is never true, as the values collection of a Map is always equal in size to its key set.

-- This message was sent by Atlassian JIRA (v6.2#6252)
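The difference between the two checks quoted in the report can be reproduced in isolation. This is a minimal sketch, not Cassandra code: the token and endpoint values are made up, and only the map-size comparison from NodeCmd.print() is modeled.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Minimal reproduction of the CASSANDRA-7082 vnode-detection check.
public class VnodeCheckDemo {
    public static void main(String[] args) {
        // Six vnode tokens spread over two endpoints (illustrative values)
        Map<String, String> tokensToEndpoints = new HashMap<>();
        tokensToEndpoints.put("-9134", "127.0.0.1");
        tokensToEndpoints.put("-42",   "127.0.0.1");
        tokensToEndpoints.put("17",    "127.0.0.1");
        tokensToEndpoints.put("1024",  "127.0.0.2");
        tokensToEndpoints.put("2048",  "127.0.0.2");
        tokensToEndpoints.put("4096",  "127.0.0.2");

        // 2.0.7 (broken): values() and keySet() of the same Map always have the
        // same size, so this condition can never be true
        boolean brokenDetectsVnodes =
            tokensToEndpoints.values().size() < tokensToEndpoints.keySet().size();

        // 2.0.6 (correct): deduplicate endpoints first, then compare the node
        // count against the token count
        boolean fixedDetectsVnodes =
            new HashSet<>(tokensToEndpoints.values()).size() < tokensToEndpoints.keySet().size();

        System.out.println(brokenDetectsVnodes); // false
        System.out.println(fixedDetectsVnodes);  // true
    }
}
```

With the broken check, isTokenPerNode stays true, so nodetool prints a single Token column instead of the vnode count, which is exactly the symptom reported.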
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977788#comment-13977788 ]

Vijay commented on CASSANDRA-7030:
----------------------------------
Hi Benedict, sorry, I missed the update earlier. I am not sure why we are comparing synchronization, hence I removed the synchronization; here are the results on RHEL (32-core box): http://pastebin.com/ZXSytn70. JEMalloc, even with the JNI overhead, is faster and more efficient.

Remove JEMallocAllocator
------------------------
Key: CASSANDRA-7030
URL: https://issues.apache.org/jira/browse/CASSANDRA-7030
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
Labels: performance
Fix For: 2.1 beta2
Attachments: 7030.txt, benchmark.21.diff.txt

JEMalloc, whilst having some nice performance properties by comparison to Doug Lea's standard malloc algorithm in principle, is pointless in practice because of the JNA cost. In general it is around 30x more expensive to call than unsafe.allocate(); malloc does not have a variability of response time as extreme as the JNA overhead, so using JEMalloc in Cassandra is never a sensible idea. I doubt if custom JNI would make it worthwhile either. I propose removing it.

-- This message was sent by Atlassian JIRA (v6.2#6252)
git commit: Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 b9324e1b9 - 4e4d7bbcb Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e4d7bbc Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e4d7bbc Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e4d7bbc Branch: refs/heads/cassandra-2.0 Commit: 4e4d7bbcb254285a1031cb232b3fe7af326e9da3 Parents: b9324e1 Author: Vijay vijay2...@gmail.com Authored: Tue Apr 22 21:30:38 2014 -0700 Committer: Vijay vijay2...@gmail.com Committed: Tue Apr 22 21:30:38 2014 -0700 -- .../org/apache/cassandra/locator/DynamicEndpointSnitch.java | 2 +- src/java/org/apache/cassandra/service/StorageService.java | 5 + .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++ 3 files changed, 13 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 00c3618..c76a196 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -300,7 +300,7 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa public void setSeverity(double severity) { -StorageService.instance.reportSeverity(severity); +StorageService.instance.reportManualSeverity(severity); } public double getSeverity() http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/service/StorageService.java -- diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java index 7382cbd..75f6427 100644 --- 
a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -1054,6 +1054,11 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
         bgMonitor.incrCompactionSeverity(incr);
     }

+    public void reportManualSeverity(double incr)
+    {
+        bgMonitor.incrManualSeverity(incr);
+    }
+
     public double getSeverity(InetAddress endpoint)
     {
         return bgMonitor.getSeverity(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
diff --git a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
index bad9a17..93906eb 100644
--- a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
+++ b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
@@ -56,6 +56,7 @@ public class BackgroundActivityMonitor
     private static final String PROC_STAT_PATH = "/proc/stat";

     private final AtomicDouble compaction_severity = new AtomicDouble();
+    private final AtomicDouble manual_severity = new AtomicDouble();
     private final ScheduledExecutorService reportThread = new DebuggableScheduledThreadPoolExecutor("Background_Reporter");

     private RandomAccessFile statsFile;
@@ -112,6 +113,11 @@ public class BackgroundActivityMonitor
         compaction_severity.addAndGet(sev);
     }

+    public void incrManualSeverity(double sev)
+    {
+        manual_severity.addAndGet(sev);
+    }
+
     public double getIOWait() throws IOException
     {
         if (statsFile == null)
@@ -157,6 +163,7 @@ public class BackgroundActivityMonitor
         if (!Gossiper.instance.isEnabled())
             return;

+        report += manual_severity.get(); // add manual severity setting
         VersionedValue updated = StorageService.instance.valueFactory.severity(report);
         Gossiper.instance.addLocalApplicationState(ApplicationState.SEVERITY, updated);
     }
[1/2] git commit: Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 2c7622a65 - ad57cb010 Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e4d7bbc Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e4d7bbc Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e4d7bbc Branch: refs/heads/cassandra-2.1 Commit: 4e4d7bbcb254285a1031cb232b3fe7af326e9da3 Parents: b9324e1 Author: Vijay vijay2...@gmail.com Authored: Tue Apr 22 21:30:38 2014 -0700 Committer: Vijay vijay2...@gmail.com Committed: Tue Apr 22 21:30:38 2014 -0700 -- .../org/apache/cassandra/locator/DynamicEndpointSnitch.java | 2 +- src/java/org/apache/cassandra/service/StorageService.java | 5 + .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++ 3 files changed, 13 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 00c3618..c76a196 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -300,7 +300,7 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa public void setSeverity(double severity) { -StorageService.instance.reportSeverity(severity); +StorageService.instance.reportManualSeverity(severity); } public double getSeverity() http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/service/StorageService.java -- diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java index 7382cbd..75f6427 100644 --- 
a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -1054,6 +1054,11 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
         bgMonitor.incrCompactionSeverity(incr);
     }

+    public void reportManualSeverity(double incr)
+    {
+        bgMonitor.incrManualSeverity(incr);
+    }
+
     public double getSeverity(InetAddress endpoint)
     {
         return bgMonitor.getSeverity(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e4d7bbc/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
diff --git a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
index bad9a17..93906eb 100644
--- a/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
+++ b/src/java/org/apache/cassandra/utils/BackgroundActivityMonitor.java
@@ -56,6 +56,7 @@ public class BackgroundActivityMonitor
     private static final String PROC_STAT_PATH = "/proc/stat";

     private final AtomicDouble compaction_severity = new AtomicDouble();
+    private final AtomicDouble manual_severity = new AtomicDouble();
     private final ScheduledExecutorService reportThread = new DebuggableScheduledThreadPoolExecutor("Background_Reporter");

     private RandomAccessFile statsFile;
@@ -112,6 +113,11 @@ public class BackgroundActivityMonitor
         compaction_severity.addAndGet(sev);
     }

+    public void incrManualSeverity(double sev)
+    {
+        manual_severity.addAndGet(sev);
+    }
+
     public double getIOWait() throws IOException
     {
         if (statsFile == null)
@@ -157,6 +163,7 @@ public class BackgroundActivityMonitor
         if (!Gossiper.instance.isEnabled())
             return;

+        report += manual_severity.get(); // add manual severity setting
         VersionedValue updated = StorageService.instance.valueFactory.severity(report);
         Gossiper.instance.addLocalApplicationState(ApplicationState.SEVERITY, updated);
     }
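The essence of the CASSANDRA-6996 fix above is that manually reported severity is accumulated separately and then folded into the gossiped report, so a JMX setSeverity call is no longer lost to the periodic compaction-only update. A minimal sketch of that bookkeeping, using the JDK's DoubleAdder in place of Guava's AtomicDouble; the class and method names are illustrative, not the real BackgroundActivityMonitor:

```java
import java.util.concurrent.atomic.DoubleAdder;

// Sketch of the severity bookkeeping pattern from CASSANDRA-6996 (names are made up).
public class SeverityMonitor {
    private final DoubleAdder compactionSeverity = new DoubleAdder();
    private final DoubleAdder manualSeverity = new DoubleAdder();

    public void incrCompactionSeverity(double sev) { compactionSeverity.add(sev); }

    // The fix: severity set manually over JMX is tracked in its own accumulator...
    public void incrManualSeverity(double sev) { manualSeverity.add(sev); }

    // ...and added to whatever the periodic reporter computed (I/O wait,
    // compaction load, etc.) before the value is gossiped to other nodes.
    public double gossipedSeverity(double periodicReport) {
        return periodicReport + manualSeverity.sum();
    }

    public static void main(String[] args) {
        SeverityMonitor m = new SeverityMonitor();
        m.incrManualSeverity(2.5);
        System.out.println(m.gossipedSeverity(1.0)); // 3.5
    }
}
```

DoubleAdder gives the same thread-safe add-and-sum behavior the patch gets from AtomicDouble, without the Guava dependency.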
[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ad57cb01
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ad57cb01
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ad57cb01

Branch: refs/heads/cassandra-2.1
Commit: ad57cb010231a89d8795a0944bd99eb6e72079cc
Parents: 2c7622a 4e4d7bb
Author: Vijay <vijay2...@gmail.com>
Authored: Tue Apr 22 21:34:22 2014 -0700
Committer: Vijay <vijay2...@gmail.com>
Committed: Tue Apr 22 21:34:22 2014 -0700

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java     | 5 +++++
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++++++
 3 files changed, 13 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad57cb01/src/java/org/apache/cassandra/service/StorageService.java
[1/3] git commit: Setting severity via JMX broken patch by Vijay; reviewed by rbranson for CASSANDRA-6996
Repository: cassandra
Updated Branches:
  refs/heads/trunk 99fbafee3 -> 902925716

Setting severity via JMX broken
patch by Vijay; reviewed by rbranson for CASSANDRA-6996

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e4d7bbc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e4d7bbc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e4d7bbc

Branch: refs/heads/trunk
Commit: 4e4d7bbcb254285a1031cb232b3fe7af326e9da3
Parents: b9324e1
Author: Vijay <vijay2...@gmail.com>
Authored: Tue Apr 22 21:30:38 2014 -0700
Committer: Vijay <vijay2...@gmail.com>
Committed: Tue Apr 22 21:30:38 2014 -0700

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java     | 5 +++++
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++++++
 3 files changed, 13 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/90292571
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/90292571
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/90292571

Branch: refs/heads/trunk
Commit: 90292571663e7ba8cf9b625b0d89fd67ae1bcc3e
Parents: 99fbafe ad57cb0
Author: Vijay <vijay2...@gmail.com>
Authored: Tue Apr 22 21:36:20 2014 -0700
Committer: Vijay <vijay2...@gmail.com>
Committed: Tue Apr 22 21:36:20 2014 -0700

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java     | 5 +++++
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++++++
 3 files changed, 13 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/90292571/src/java/org/apache/cassandra/service/StorageService.java
[2/3] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ad57cb01
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ad57cb01
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ad57cb01

Branch: refs/heads/trunk
Commit: ad57cb010231a89d8795a0944bd99eb6e72079cc
Parents: 2c7622a 4e4d7bb
Author: Vijay <vijay2...@gmail.com>
Authored: Tue Apr 22 21:34:22 2014 -0700
Committer: Vijay <vijay2...@gmail.com>
Committed: Tue Apr 22 21:34:22 2014 -0700

----------------------------------------------------------------------
 .../org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java     | 5 +++++
 .../org/apache/cassandra/utils/BackgroundActivityMonitor.java | 7 +++++++
 3 files changed, 13 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad57cb01/src/java/org/apache/cassandra/service/StorageService.java
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970297#comment-13970297 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

You are right. I had the synchronization in the test attached to the old ticket because we initially had some segfaults, which were fixed in later JEM releases; the synchronization was never committed into the Cassandra repo because by then it was fixed.

Rerunning the test after removing the locks in the same old test classes, the time taken is much better with jemalloc; you might need more runs. The memory footprint is better too (malloc is slower and uses more memory comparatively, per my tests). http://pastebin.com/JtixVvGU

As mentioned earlier, I don't mind removing it either :)

> Remove JEMallocAllocator
> ------------------------
>
>                 Key: CASSANDRA-7030
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7030
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1 beta2
>
>         Attachments: 7030.txt
>
> JEMalloc, whilst having some nice performance properties by comparison to Doug Lea's standard malloc algorithm in principle, is pointless in practice because of the JNA cost. In general it is around 30x more expensive to call than unsafe.allocate(); malloc does not have a variability of response time as extreme as the JNA overhead, so using JEMalloc in Cassandra is never a sensible idea. I doubt if custom JNI would make it worthwhile either. I propose removing it.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (CASSANDRA-6974) Replaying archived commitlogs isn't working
[ https://issues.apache.org/jira/browse/CASSANDRA-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968508#comment-13968508 ]

Vijay commented on CASSANDRA-6974:
----------------------------------

Hi Benedict,

{quote}
Not only does this mean logs are replayed out of order
{quote}

I don't think we guarantee that CL replay will be in order; with Cassandra we don't need to, isn't it? Maybe I am missing something. With reuse of the segments it is hard to sequence them based on the names... encoding a unique id which increments all the time is what we are talking about?

> Replaying archived commitlogs isn't working
> -------------------------------------------
>
>                 Key: CASSANDRA-6974
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6974
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Ryan McGuire
>            Assignee: Benedict
>             Fix For: 2.1 beta2
>
>         Attachments: 2.0.system.log, 2.1.system.log
>
> I have a test for restoring archived commitlogs, which is not working in 2.1 HEAD. My commitlogs consist of 30,000 inserts, but system.log indicates there were only 2 mutations replayed:
> {code}
> INFO [main] 2014-04-02 11:49:54,173 CommitLog.java:115 - Log replay complete, 2 replayed mutations
> {code}
> There are several warnings in the logs about bad headers and invalid CRCs:
> {code}
> WARN [main] 2014-04-02 11:49:54,156 CommitLogReplayer.java:138 - Encountered bad header at position 0 of commit log /tmp/dtest-mZIlPE/test/node1/commitlogs/CommitLog-4-1396453793570.log, with invalid CRC. The end of segment marker should be zero.
> {code}
> Compare that to the same test run on 2.0, where it replayed many more mutations:
> {code}
> INFO [main] 2014-04-02 11:49:04,673 CommitLog.java (line 132) Log replay complete, 35960 replayed mutations
> {code}
> I'll attach the system logs for reference. [Here is the dtest to reproduce this|https://github.com/riptano/cassandra-dtest/blob/master/snapshot_test.py#L75] (This currently relies on the fix for snapshots available in CASSANDRA-6965.)
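The segment-naming question Vijay raises (reused segments cannot be sequenced by file name) is commonly solved by embedding a strictly increasing id in each segment's name. A minimal sketch with illustrative names only; this is not Cassandra's actual CommitLogSegment code:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: give every (possibly recycled) commitlog segment a unique,
// monotonically increasing id so replay order can be recovered from
// file names alone, even when segment files are reused.
public class SegmentIds {
    // Seed with a timestamp so ids keep increasing across process restarts.
    private static final AtomicLong nextId = new AtomicLong(System.currentTimeMillis());

    static String newSegmentName(int version) {
        return "CommitLog-" + version + "-" + nextId.incrementAndGet() + ".log";
    }

    public static void main(String[] args) {
        // A recycled segment gets a fresh id, so it always sorts after
        // every segment created before it.
        String first = newSegmentName(4);
        String recycled = newSegmentName(4);
        System.out.println(first + " then " + recycled);
    }
}
```

Replay would then sort segments by the numeric id extracted from the name instead of relying on file timestamps.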
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968515#comment-13968515 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

Hi Benedict,

Interesting; how slow is slow in terms of Cassandra's throughput/latencies? Isn't it a tradeoff between memory fragmentation (use) vs speed? Here is the test for memory footprint:
https://issues.apache.org/jira/browse/CASSANDRA-3997?focusedCommentId=13243924&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13243924
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968847#comment-13968847 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

JEMalloc will be better if you have different-sized objects (if they are all the same size, there isn't much fragmentation); the benchmark in the other ticket uses 5 200, which gives you a distribution that can be compared... I don't think the test you are proposing is valid... Anyway, I don't really have any bias on this ticket, as we don't use this in production and I am thinking about JNI alternatives for the serialized cache anyway...
[jira] [Comment Edited] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968847#comment-13968847 ]

Vijay edited comment on CASSANDRA-7030 at 4/14/14 9:17 PM:
-----------------------------------------------------------

JEMalloc will be better if you have different-sized objects (if they are all the same size, there isn't much fragmentation); the benchmark in the other ticket uses 5 200, which gives you a distribution that can be compared... I don't think the test you are proposing is in the same lines... Anyway, I don't really have any bias on this ticket, as we don't use this in production and I am thinking about JNI alternatives for the serialized cache anyway...

was (Author: vijay2...@yahoo.com):
JEMalloc will be better if you have different-sized objects (if they are all the same size, there isn't much fragmentation); the benchmark in the other ticket uses 5 200, which gives you a distribution that can be compared... I don't think the test you are proposing is valid... Anyway, I don't really have any bias on this ticket, as we don't use this in production and I am thinking about JNI alternatives for the serialized cache anyway...
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968960#comment-13968960 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

Not enough iterations, I guess. I just ran the old tool again (to verify that I am still making sense :)) and I do get the same kind of results (http://pastebin.com/LPFUutaY)... Things might be different, like the kernel version etc. Also, there is no difference in the allocator except that malloc is overridden to do threaded allocation; you can see benchmarks inline too.
[jira] [Comment Edited] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968960#comment-13968960 ]

Vijay edited comment on CASSANDRA-7030 at 4/14/14 10:52 PM:
------------------------------------------------------------

Not enough iterations, I guess. I just ran the old tool again (to verify that I am still making sense :)) and I do get the same kind of results (http://pastebin.com/LPFUutaY)... Things might be different, like the kernel version etc. Also, there is no difference in the allocator except that malloc is overridden to do threaded allocation; you can see benchmarks online too.

was (Author: vijay2...@yahoo.com):
Not enough iterations, I guess. I just ran the old tool again (to verify that I am still making sense :)) and I do get the same kind of results (http://pastebin.com/LPFUutaY)... Things might be different, like the kernel version etc. Also, there is no difference in the allocator except that malloc is overridden to do threaded allocation; you can see benchmarks inline too.
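Vijay's "not enough iterations" point is the classic allocation-microbenchmark pitfall: without warmup and a large measured run, JIT and allocator startup costs dominate. A minimal, hypothetical harness sketch in that spirit; it uses direct ByteBuffers as a stand-in for native allocation, since the real test would call jemalloc or Unsafe via JNA/JNI:

```java
import java.nio.ByteBuffer;

// Skeleton of an allocation microbenchmark: warm up first, then time a
// large measured run. Direct ByteBuffers here are only a stand-in for
// the native allocators being compared in this ticket.
public class AllocBench {
    static long timeAllocations(int iterations, int size) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            ByteBuffer b = ByteBuffer.allocateDirect(size);
            b.put(0, (byte) 1); // touch the memory so the work isn't elided
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        timeAllocations(10_000, 256);            // warmup: JIT + allocator setup
        long nanos = timeAllocations(100_000, 256); // measured run
        System.out.println("100k x 256B allocations: " + nanos + " ns");
    }
}
```

Comparing allocators fairly also means varying the allocation size (the "different-sized objects" point above), since fixed-size runs hide fragmentation behavior.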
[jira] [Updated] (CASSANDRA-6996) Setting severity via JMX broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-6996:
-----------------------------
    Attachment: 0001-CASSANDRA-6996.patch

The problem is that severity is now based on the IOWait (if Unix-based), but the DES sets the compaction severity, which is ignored. The attached patch adds a manual_severity variable to override both... Thanks!

> Setting severity via JMX broken
> -------------------------------
>
>                 Key: CASSANDRA-6996
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6996
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Rick Branson
>            Assignee: Vijay
>            Priority: Minor
>
>         Attachments: 0001-CASSANDRA-6996.patch
>
> Looks like setting the Severity attribute in the DynamicEndpointSnitch via JMX is a no-op.
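The behavior the patch restores can be sketched as a small accumulator: the measured background severity (IOWait, or compaction severity when IOWait is unavailable) and a separately tracked, JMX-settable manual severity are both folded into the gossiped value. Class and method names below are illustrative, not Cassandra's actual API; Guava's AtomicDouble is emulated with AtomicLong bit tricks to keep the sketch dependency-free:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the patched severity reporting: a manual severity set via
// JMX is accumulated separately and always added to the reported value,
// instead of being silently dropped as before the fix.
class SeverityMonitor {
    private final AtomicLong compaction = new AtomicLong(Double.doubleToLongBits(0.0));
    private final AtomicLong manual = new AtomicLong(Double.doubleToLongBits(0.0));

    private static void add(AtomicLong a, double v) {
        long prev;
        do { prev = a.get(); }
        while (!a.compareAndSet(prev, Double.doubleToLongBits(Double.longBitsToDouble(prev) + v)));
    }

    void incrCompactionSeverity(double sev) { add(compaction, sev); }
    void incrManualSeverity(double sev) { add(manual, sev); } // path used by JMX setSeverity

    // ioWait < 0 means "unavailable" here; fall back to compaction severity.
    double reportedSeverity(double ioWait) {
        double report = ioWait >= 0 ? ioWait : Double.longBitsToDouble(compaction.get());
        return report + Double.longBitsToDouble(manual.get()); // the line the patch adds
    }
}

public class SeverityDemo {
    public static void main(String[] args) {
        SeverityMonitor m = new SeverityMonitor();
        m.incrManualSeverity(2.5);                    // operator sets severity via JMX
        System.out.println(m.reportedSeverity(0.5));  // 3.0
    }
}
```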
[jira] [Commented] (CASSANDRA-6995) Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to read stage
[ https://issues.apache.org/jira/browse/CASSANDRA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963177#comment-13963177 ]

Vijay commented on CASSANDRA-6995:
----------------------------------

Just my 2 cents: I like the idea of dispatching to a separate TPE directly from thrift/native, removing the need to switch ctx. How about continuing the work in CASSANDRA-5239 and getting rid of the IO threads? (In addition, we could use the timeout thread to speculate, making the storage proxy completely async post-5239?)...

On a separate note, shouldn't we throttle on the number of reads from disk instead of concurrent_reads and concurrent_writes? It's unfair to the threads when one request pulls a lot more data than others.

> Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to read stage
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6995
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6995
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.0.7
>
>         Attachments: 6995-v1.diff, syncread-stress.txt
>
> When performing a read local to a coordinator node, AbstractReadExecutor will create a new SP.LocalReadRunnable and drop it into the read stage for asynchronous execution. If you are using a client that intelligently routes read requests to a node holding the data for a given request, and are using CL.ONE/LOCAL_ONE, enqueuing the SP.LocalReadRunnable and waiting for the context switches (and possible NUMA misses) adds unnecessary latency. We can reduce that latency and improve throughput by avoiding the queueing and thread context switching, simply executing the SP.LocalReadRunnable synchronously in the request thread. Testing on a three node cluster (each with 32 cpus, 132 GB ram) yields ~10% improvement in throughput and ~20% speedup on avg/95/99 percentiles (99.9% was about 5-10% improvement).
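The tradeoff discussed in this ticket can be sketched in miniature: a CL.ONE read whose data lives on the coordinator can either hop through a read-stage executor (an extra queue plus context switch) or run synchronously on the request thread. The classes and method names below are illustrative only, not Cassandra's actual read path:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch of the two read paths compared in CASSANDRA-6995.
public class LocalReadDemo {
    static final ExecutorService READ_STAGE = Executors.newFixedThreadPool(4);

    // Stand-in for the actual local read of a key's data.
    static String readLocally(String key) {
        return "value-for-" + key;
    }

    // Pre-change path: always enqueue to the read stage, then wait.
    // The submit/get round trip is where the queueing and context-switch
    // latency comes from.
    static String dispatchToStage(String key) throws Exception {
        Future<String> f = READ_STAGE.submit(() -> readLocally(key));
        return f.get();
    }

    // Post-change path for CL.ONE local reads: execute synchronously on
    // the request thread, skipping the queue entirely.
    static String executeOnRequestThread(String key) {
        return readLocally(key);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(dispatchToStage("k1"));
        System.out.println(executeOnRequestThread("k1"));
        READ_STAGE.shutdown();
    }
}
```

Both paths return the same result; the patch's ~10% throughput gain comes purely from removing the executor hop on the hot path.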
[jira] [Comment Edited] (CASSANDRA-6995) Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to read stage
[ https://issues.apache.org/jira/browse/CASSANDRA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963177#comment-13963177 ]

Vijay edited comment on CASSANDRA-6995 at 4/8/14 5:05 PM:
----------------------------------------------------------

Just my 2 cents: I like the idea of dispatching to separate Stages directly from thrift/native, removing the need to switch ctx. How about continuing the work in CASSANDRA-5239 and getting rid of the IO threads? (In addition, we could use the timeout thread to speculate, making the storage proxy completely async post-5239?)... On a separate note, shouldn't we throttle on the number of reads from disk instead of concurrent_reads and concurrent_writes? It's unfair to the threads when one request pulls a lot more data than others.

was (Author: vijay2...@yahoo.com):
Just my 2 cents: I like the idea of dispatching to a separate TPE directly from thrift/native, removing the need to switch ctx. How about continuing the work in CASSANDRA-5239 and getting rid of the IO threads? (In addition, we could use the timeout thread to speculate, making the storage proxy completely async post-5239?)... On a separate note, shouldn't we throttle on the number of reads from disk instead of concurrent_reads and concurrent_writes? It's unfair to the threads when one request pulls a lot more data than others.
git commit: sstableloader fails to stream data Patch by Vijay, reviewed by Yuki Morishita for CASSANDRA-6965
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 5b561067d -> 69960afa3

sstableloader fails to stream data
Patch by Vijay, reviewed by Yuki Morishita for CASSANDRA-6965

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69960afa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69960afa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69960afa

Branch: refs/heads/cassandra-2.1
Commit: 69960afa3ec668c82d6bf42bb82bdcdf16b1bbe3
Parents: 5b56106
Author: Vijay <vijay2...@gmail.com>
Authored: Wed Apr 2 16:41:07 2014 -0700
Committer: Vijay <vijay2...@gmail.com>
Committed: Wed Apr 2 16:41:07 2014 -0700

----------------------------------------------------------------------
 src/java/org/apache/cassandra/streaming/StreamManager.java | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69960afa/src/java/org/apache/cassandra/streaming/StreamManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/streaming/StreamManager.java b/src/java/org/apache/cassandra/streaming/StreamManager.java
index 872e524..6dfb1ac 100644
--- a/src/java/org/apache/cassandra/streaming/StreamManager.java
+++ b/src/java/org/apache/cassandra/streaming/StreamManager.java
@@ -76,8 +76,11 @@ public class StreamManager implements StreamManagerMBean
             double interDCThroughput = ((double) DatabaseDescriptor.getInterDCStreamThroughputOutboundMegabitsPerSec()) * ONE_MEGA_BYTE;
             mayUpdateThroughput(interDCThroughput, interDCLimiter);
-            isLocalDC = DatabaseDescriptor.getLocalDataCenter().equals(
-                            DatabaseDescriptor.getEndpointSnitch().getDatacenter(peer));
+            if (DatabaseDescriptor.getLocalDataCenter() != null && DatabaseDescriptor.getEndpointSnitch() != null)
+                isLocalDC = DatabaseDescriptor.getLocalDataCenter().equals(
+                                DatabaseDescriptor.getEndpointSnitch().getDatacenter(peer));
+            else
+                isLocalDC = true;
         }

         private void mayUpdateThroughput(double limit, RateLimiter rateLimiter)
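The fix in this commit is a null guard: sstableloader runs client-side, where server configuration such as the local data center and the endpoint snitch may be absent, so dereferencing them unconditionally throws a NullPointerException. The pattern can be isolated as follows (method and parameter names are illustrative, not the StreamManager API):

```java
// Sketch of the CASSANDRA-6965 guard: when the DC configuration is not
// available (e.g. an offline bulk-load tool rather than a server node),
// fall back to treating the peer as local-DC instead of throwing NPE.
public class LocalDcGuard {
    static boolean isLocalDC(String localDc, String peerDc) {
        if (localDc != null && peerDc != null)
            return localDc.equals(peerDc);
        return true; // no config available: assume local, use the local-DC rate limiter
    }

    public static void main(String[] args) {
        System.out.println(isLocalDC(null, "dc1"));  // client context, no local DC configured
        System.out.println(isLocalDC("dc1", "dc2")); // server context, remote peer
    }
}
```

Defaulting to "local" on missing config means the loader is throttled by the (usually more generous) local-DC stream limit rather than the inter-DC one.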