Re: Cassandra adding node issue (no UJ status)
Hey Rock,

I've seen this occur as well. I've come to learn that in some cases, like a network blip, the join can fail. There is usually something in the log to the effect of "Stream failed". When I encounter this issue, I make an attempt to bootstrap the new node again. If that doesn't help, I run a repair on the new node.

On Tue, Sep 15, 2015 at 3:14 AM, Rock Zhang wrote:
> Hi All,
>
> I have a problem every time I add a new node: the data does not get rebalanced, and the node just sits there as a new, nearly empty node.
>
> Previously I would see the UJ (joining) status. Has anybody experienced this kind of issue? I don't know what changed.
>
> ubuntu@ip-172-31-15-242:/etc/cassandra$ nodetool status rawdata
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  172.31.16.191  742.15 GB  256     21.1%             08638815-b721-46c4-b77c-af08285226db  RAC1
> UN  172.31.30.145  774.42 GB  256     21.2%             bde31dd9-ff1d-4f2f-b28d-fe54d0531c51  RAC1
> UN  172.31.6.79    592.9 GB   256     19.8%             15795fca-5425-41cd-909c-c1756715442a  RAC2
> UN  172.31.27.186  674.42 GB  256     18.4%             9685a476-1da7-4c6f-819e-dd4483e3345e  RAC1
> UN  172.31.7.31    642.47 GB  256     19.9%             f7c8c6fb-ab37-4124-ba1a-a9a1beaecc1b  RAC1
> *UN 172.31.15.242  37.4 MB    256     19.8%             c3eff010-9904-49a0-83cd-258dc5a98525  RAC1*
> UN  172.31.24.32   780.59 GB  256     20.1%             ffa58bd1-3188-440d-94c9-97166ee4b735  RAC1
> *UN 172.31.3.40    80.75 GB   256     18.9%             01ce3f96-ebc0-4128-9ec3-ddd1a9845d51  RAC1*
> UN  172.31.31.238  756.59 GB  256     19.9%             82d34a3b-4f12-4874-816c-7d89a0535577  RAC1
> UN  172.31.31.99   583.68 GB  256     20.8%             2b10194f-23d2-4bdc-bcfa-7961a149cd11  RAC2
>
> Thanks
>
> Rock
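For reference, a re-bootstrap attempt on the new node typically looks something like the sketch below. The log path and service name assume a default package install, and the keyspace name comes from the thread; adjust for your environment.

    # look for the streaming failure on the joining node
    grep -i "stream failed" /var/log/cassandra/system.log

    # stop the node, wipe its partially streamed state, and let it bootstrap again
    sudo service cassandra stop
    rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
    sudo service cassandra start

    # if it still joins with almost no load, repair it
    nodetool repair rawdata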
Re: Nodetool repair with Load times 5
Hey Jean,

Did you try running a nodetool cleanup on all your nodes, perhaps one at a time?

On Tue, Aug 18, 2015 at 3:59 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:

Hi,

I have a phenomenon I cannot explain, and I would like to understand. I'm running Cassandra 2.1.8 on a cluster of 5 nodes. I'm using replication factor 3, with mostly default settings.

Last week I ran a nodetool status, which showed a load of about 200 GB on each node. Since then there have been no deletes and no inserts. This weekend I ran:

nodetool -h 192.168.2.100 repair -pr -par -inc

And now when I run nodetool status I see a completely different picture!

nodetool -h zennode0 status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns  Host ID                               Rack
UN  192.168.2.104  940.73 GB  256     ?     c13e0858-091c-47c4-8773-6d6262723435  rack1
UN  192.168.2.100  1.07 TB    256     ?     c32a9357-e37e-452e-8eb1-57d86314b419  rack1
UN  192.168.2.101  189.03 GB  256     ?     9af90dea-90b3-4a8a-b88a-0aeabe3cea79  rack1
UN  192.168.2.102  951.28 GB  256     ?     8eb7a5bb-6903-4ae1-a372-5436d0cc170c  rack1
UN  192.168.2.103  196.54 GB  256     ?     9efc6f13-2b02-4400-8cde-ae831feb86e9  rack1

The nodes 192.168.2.101 and 103 are about where they were last week, but the three other nodes now have a load about 5 times bigger!

1) Is this normal?
2) What is the meaning of the Load column?
3) Is there anything to fix? Can I leave it like that? Strange that I'm asking for a fix after I ran a *repair*.

Thanks a lot for your help.

Kind regards

Jean
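To make the cleanup suggestion concrete, running it serially across the cluster could look roughly like this, using the node addresses from Jean's output (cleanup rewrites SSTables, so running one node at a time keeps the I/O impact down):

    for h in 192.168.2.100 192.168.2.101 192.168.2.102 192.168.2.103 192.168.2.104; do
        nodetool -h "$h" cleanup
    done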
RepairException on C* 2.1.3
I'm receiving an exception when I run a repair process via: 'nodetool repair -par keyspace' I'm not sure if this is a bug or not but was curious to know if there was something that can be done to remedy this situation? Full stack trace from the logs: ERROR [ValidationExecutor:3] 2015-04-17 18:16:56,174 Validator.java:232 - Failed creating a merkle tree for [repair #ee449ac0-e52d-11e4-bce7-8dc78829adc8 on mykeyspace/mycolumnfamily (-6047392565169616230,-6042578405807739912]], /10.0.111.229 (see log for details) INFO [AntiEntropySessions:1] 2015-04-17 18:16:56,175 RepairSession.java:260 - [repair #ee450ff0-e52d-11e4-bce7-8dc78829adc8] new session: will sync /10.0.111.229, /10.0.112.183 on range (-301812044562523205,-262462695890469432] for mykeyspace.[mycolumnfamily] INFO [AntiEntropySessions:1] 2015-04-17 18:16:56,175 RepairJob.java:163 - [repair #ee450ff0-e52d-11e4-bce7-8dc78829adc8] requesting merkle trees for mycolumnfamily (to [/10.0.112.183, /10.0.111.229]) ERROR [AntiEntropySessions:2] 2015-04-17 18:16:56,181 RepairSession.java:303 - [repair #ee449ac0-e52d-11e4-bce7-8dc78829adc8] session completed with the following error org.apache.cassandra.exceptions.RepairException: [repair #ee449ac0-e52d-11e4-bce7-8dc78829adc8 on mykeyspace/mycolumnfamily, (-6047392565169616230,-6042578405807739912]] Validation failed in / 10.0.111.229 at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:403) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:132) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75] ERROR [ValidationExecutor:3] 2015-04-17 18:16:56,181 CassandraDaemon.java:167 - Exception in thread Thread[ValidationExecutor:3,1,main] java.lang.NullPointerException: null at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:1277) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.SSTableScanner.getScanner(SSTableScanner.java:62) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1640) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1629) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.compaction.LeveledCompactionStrategy$LeveledScanner.init(LeveledCompactionStrategy.java:262) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getScanners(LeveledCompactionStrategy.java:189) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getScanners(WrappingCompactionStrategy.java:357) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:979) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:95) ~[apache-cassandra-2.1.3.jar:2.1.3] at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:617) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75] ERROR [AntiEntropySessions:2] 2015-04-17 18:16:56,182 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:2,5,RMI Runtime] java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: [repair #ee449ac0-e52d-11e4-bce7-8dc78829adc8 on mykeyspace/mycolumnfamily, (-6047392565169616230,-6042578405807739912]] Validation failed in / 10.0.111.229 at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_75] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Re: Column value not getting updated
Hey Saurabh,

We're actually preparing for this ourselves and spinning up our own NTP server pool. The public NTP pools have a lot of drift and should not be relied upon for cluster technology that is sensitive to time skew, like C*.

The folks at Logentries did a great write-up about this which we used as a guide.

- https://blog.logentries.com/2014/03/synchronizing-clocks-in-a-cassandra-cluster-pt-1-the-problem/
- https://blog.logentries.com/2014/03/synchronizing-clocks-in-a-cassandra-cluster-pt-2-solutions/

-Mark

On Tue, Mar 31, 2015 at 5:59 PM, Saurabh Sethi saurabh_se...@symantec.com wrote:

That's what I found out -- that the clocks were not in sync. But I have set up NTP on all 3 nodes and would expect the clocks to be in sync.

From: Nate McCall n...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Tuesday, March 31, 2015 at 2:50 PM
To: Cassandra Users user@cassandra.apache.org
Subject: Re: Column value not getting updated

You would see that if the servers' clocks were out of sync. Make sure the time on the servers is in sync or set the client timestamps explicitly.

On Tue, Mar 31, 2015 at 3:23 PM, Saurabh Sethi saurabh_se...@symantec.com wrote:

I have written a unit test that creates a column family, inserts a row in that column family and then updates the value of one of the columns. After updating, the unit test immediately tries to read the updated value for that column, but Cassandra returns the old value.

- I am using the QueryBuilder API and not CQL directly.
- I am using the consistency level of QUORUM for everything -- insert, update and read.
- Cassandra is running as a 3 node cluster with replication factor of 3.

Anyone has any idea what is going on here?

Thanks,
Saurabh

--
Nate McCall
Austin, TX
@zznate
Co-Founder Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
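If keeping the clocks perfectly in sync is hard to guarantee, Nate's other suggestion, setting the timestamp on the client side, can be done directly in CQL. A minimal sketch (the table and column names here are made up; the value is microseconds since the epoch):

    UPDATE my_table
    USING TIMESTAMP 1427841000000000
    SET my_col = 'new value'
    WHERE id = 1;

The Java driver's QueryBuilder also lets you attach a timestamp to a statement, so a unit test can stamp its own monotonically increasing timestamps instead of relying on the server clocks.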
C* 2.1.3 - Incremental replacement of compacted SSTables
I saw in the NEWS.txt that this has been disabled. Does anyone know why that was the case? Is it temporary just for the 2.1.3 release? Thanks, Mark Greene
Re: High Bloom Filter FP Ratio
We're seeing similar behavior except our FP ratio is closer to 1.0 (100%). We're using Cassandra 2.1.2.

Schema
---
CREATE TABLE contacts.contact (
    id bigint,
    property_id int,
    created_at bigint,
    updated_at bigint,
    value blob,
    PRIMARY KEY (id, property_id)
) WITH CLUSTERING ORDER BY (property_id ASC)
    *AND bloom_filter_fp_chance = 0.001*
    AND caching = '{keys:ALL, rows_per_partition:NONE}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CF Stats Output:
-
Keyspace: contacts
    Read Count: 2458375
    Read Latency: 0.852844076675 ms.
    Write Count: 10357
    Write Latency: 0.1816912233272183 ms.
    Pending Flushes: 0
    Table: contact
        SSTable count: 61
        SSTables in each level: [1, 10, 50, 0, 0, 0, 0, 0, 0]
        Space used (live): 9047112471
        Space used (total): 9047112471
        Space used by snapshots (total): 0
        SSTable Compression Ratio: 0.34119240020241487
        Memtable cell count: 24570
        Memtable data size: 1299614
        Memtable switch count: 2
        Local read count: 2458290
        Local read latency: 0.853 ms
        Local write count: 10044
        Local write latency: 0.186 ms
        Pending flushes: 0
        Bloom filter false positives: 11096
        *Bloom filter false ratio: 0.99197*
        Bloom filter space used: 3923784
        Compacted partition minimum bytes: 373
        Compacted partition maximum bytes: 152321
        Compacted partition mean bytes: 9938
        Average live cells per slice (last five minutes): 37.57851240677983
        Maximum live cells per slice (last five minutes): 63.0
        Average tombstones per slice (last five minutes): 0.0
        Maximum tombstones per slice (last five minutes): 0.0

--
about.me http://about.me/markgreene

On Wed, Dec 17, 2014 at 1:32 PM, Chris Hart ch...@remilon.com wrote:

Hi,

I have created the following table with bloom_filter_fp_chance=0.01:

CREATE TABLE logged_event (
    time_key bigint,
    partition_key_randomizer int,
    resource_uuid timeuuid,
    event_json text,
    event_type text,
    field_error_list map<text, text>,
    javascript_timestamp timestamp,
    javascript_uuid uuid,
    page_impression_guid uuid,
    page_request_guid uuid,
    server_received_timestamp timestamp,
    session_id bigint,
    PRIMARY KEY ((time_key, partition_key_randomizer), resource_uuid)
) WITH
    bloom_filter_fp_chance=0.01 AND
    caching='KEYS_ONLY' AND
    comment='' AND
    dclocal_read_repair_chance=0.00 AND
    gc_grace_seconds=864000 AND
    index_interval=128 AND
    read_repair_chance=0.00 AND
    replicate_on_write='true' AND
    populate_io_cache_on_flush='false' AND
    default_time_to_live=0 AND
    speculative_retry='99.0PERCENTILE' AND
    memtable_flush_period_in_ms=0 AND
    compaction={'class': 'SizeTieredCompactionStrategy'} AND
    compression={'sstable_compression': 'LZ4Compressor'};

When I run cfstats, I see a much higher false positive ratio:

    Table: logged_event
        SSTable count: 15
        Space used (live), bytes: 104128214227
        Space used (total), bytes: 104129482871
        SSTable Compression Ratio: 0.3295840184239226
        Number of keys (estimate): 199293952
        Memtable cell count: 56364
        Memtable data size, bytes: 20903960
        Memtable switch count: 148
        Local read count: 1396402
        Local read latency: 0.362 ms
        Local write count: 2345306
        Local write latency: 0.062 ms
        Pending tasks: 0
        Bloom filter false positives: 147705
        Bloom filter false ratio: 0.49020
        Bloom filter space used, bytes: 249129040
        Compacted partition minimum bytes: 447
        Compacted partition maximum bytes: 315852
        Compacted partition mean bytes: 1636
        Average live cells per slice (last five minutes): 0.0
        Average tombstones per slice (last five minutes): 0.0

Any idea what could be causing this? This is timeseries data. Every time we read from this table, we read a single row key with 1000 partition_key_randomizer values. I'm running cassandra 2.0.11. I tried running an upgradesstables to rewrite
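A note on the knob itself: changing bloom_filter_fp_chance only affects SSTables written after the change, so the filters have to be rewritten before the new ratio shows up in cfstats. A rough sketch, reusing the keyspace/table names from this thread (the -a flag asks upgradesstables to rewrite all SSTables, not just old-format ones):

    ALTER TABLE contacts.contact WITH bloom_filter_fp_chance = 0.01;   -- in cqlsh
    nodetool upgradesstables -a contacts contact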
Error when dropping keyspaces; One row required, 0 found
I'm running Cassandra 2.1.0. I was attempting to drop two keyspaces via cqlsh and encountered an error in the CLI as well as the appearance of losing all my keyspaces. Below is the output from my cqlsh session:

$ cqlsh
Connected to Production Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.0 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> desc keyspaces;

contacts_index  contacts_testing  contacts  system  OpsCenter  system_traces

cqlsh> drop keyspace contacts_index;
cqlsh> drop keyspace contacts;
ErrorMessage code= [Server error] message=java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
cqlsh> drop keyspace contacts;
ErrorMessage code= [Server error] message=java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: One row required, 0 found
cqlsh> desc keyspaces;

<empty>    <-- OH SHIT

After it appeared that I had lost all my keyspaces, I looked at the system.log and found this (full log attached):

ERROR [MigrationStage:1] 2014-12-01 23:54:05,622 CassandraDaemon.java:166 - Exception in thread Thread[MigrationStage:1,5,main]
java.lang.IllegalStateException: One row required, 0 found
    at org.apache.cassandra.cql3.UntypedResultSet$FromResultSet.one(UntypedResultSet.java:78) ~[apache-cassandra-2.1.0.jar:2.1.0]
    at org.apache.cassandra.config.KSMetaData.fromSchema(KSMetaData.java:275) ~[apache-cassandra-2.1.0.jar:2.1.0]
    at org.apache.cassandra.db.DefsTables.mergeKeyspaces(DefsTables.java:230) ~[apache-cassandra-2.1.0.jar:2.1.0]
    at org.apache.cassandra.db.DefsTables.mergeSchemaInternal(DefsTables.java:186) ~[apache-cassandra-2.1.0.jar:2.1.0]
    at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:164) ~[apache-cassandra-2.1.0.jar:2.1.0]
    at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:49) ~[apache-cassandra-2.1.0.jar:2.1.0]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.0.jar:2.1.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_65]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]

At this point I wasn't sure quite what to do, so I did a rolling restart of the entire ring. Afterwards, the keyspaces I had not tried to delete came back when running 'desc keyspaces', and the two keyspaces I had intended to delete were gone, as expected.

Strangely enough, because we run OpsCenter, we lost the dashboards we had configured. Not a total deal breaker, but concerning that data loss occurred here, assuming it's related.

Anyone run into something like this before?

(system.log attached)
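One quick diagnostic worth noting for schema changes that go sideways like this: before and after DDL such as DROP KEYSPACE, all nodes should agree on a single schema version. A sketch of the check (run against any node):

    nodetool describecluster
    # the "Schema versions" section should list every node under one UUID;
    # multiple UUIDs suggest the schema migration did not propagate cleanly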
Dynamic Columns in Cassandra 2.X
I'm looking for some best practices w/r/t supporting arbitrary columns. It seems from the docs I've read around CQL that they are supported in some capacity via collections, but you can't exceed 64K in size. For my requirements that would cause problems.

So my questions are:

1) Is using Thrift a valid approach in the era of CQL?
2) If CQL is the best practice, should I alter the schema at runtime when I detect I need to do a schema mutation?
3) If I utilize CQL collections, will Cassandra page the entire thing into the heap?

My data model is akin to a CRM, with arbitrary column definitions per customer.

Cheers,
Mark
Re: Dynamic Columns in Cassandra 2.X
Thanks DuyHai, I have a follow-up question to #2. You mentioned that ideally I would create a new table instead of mutating an existing one. That strikes me as bad practice in the world of multi-tenant systems; I don't want to create a table per customer. So I'm wondering if dynamically modifying the table is an accepted practice?

--
about.me http://about.me/markgreene

On Fri, Jun 13, 2014 at 2:54 PM, DuyHai Doan doanduy...@gmail.com wrote:

Hello Mark

Dynamic columns, as you said, are perfectly supported by CQL3 via clustering columns. And no, using collections for storing dynamic data is a very bad idea if the cardinality is very high (> 1000 elements).

1) Is using Thrift a valid approach in the era of CQL? --> Less and less. Unless you are looking for extreme performance, you'd be better off choosing CQL3. The ease of programming and querying with CQL3 is worth the small overhead in CPU.

2) If CQL is the best practice, should I alter the schema at runtime when I detect I need to do a schema mutation? --> Ideally you should not alter the schema but create a new table to adapt to your changing requirements.

3) If I utilize CQL collections, will Cassandra page the entire thing into the heap? --> Of course. All collections and maps in Cassandra are eagerly loaded entirely in memory on the server side. That's why it is recommended to limit their cardinality to ~1000 elements.

On Fri, Jun 13, 2014 at 8:33 PM, Mark Greene green...@gmail.com wrote:

I'm looking for some best practices w/r/t supporting arbitrary columns. It seems from the docs I've read around CQL that they are supported in some capacity via collections but you can't exceed 64K in size. For my requirements that would cause problems. So my questions are:

1) Is using Thrift a valid approach in the era of CQL?
2) If CQL is the best practice, should I alter the schema at runtime when I detect I need to do a schema mutation?
3) If I utilize CQL collections, will Cassandra page the entire thing into the heap?

My data model is akin to a CRM, arbitrary column definitions per customer.

Cheers,
Mark
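For readers following along, the clustering-column approach DuyHai is describing looks roughly like this; the table and column names are invented for illustration:

    CREATE TABLE custom_fields (
        customer_id bigint,
        field_name  text,
        field_value text,
        PRIMARY KEY (customer_id, field_name)
    );

    -- adding a "column" is just an insert, no schema change required
    INSERT INTO custom_fields (customer_id, field_name, field_value)
    VALUES (42, 'favorite_color', 'blue');

    -- fetch every custom field for one customer
    SELECT field_name, field_value FROM custom_fields WHERE customer_id = 42;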
Re: Dynamic Columns in Cassandra 2.X
My use case requires the support of arbitrary columns much like a CRM. My users can define 'custom' fields within the application. Ideally I wouldn't have to change the schema at all, which is why I like the old thrift approach rather than the CQL approach. Having said all that, I'd be willing to adapt my API to make explicit schema changes to Cassandra whenever my user makes a change to their custom fields if that's an accepted practice. Ultimately, I'm trying to figure out of the Cassandra community intends to support true schemaless use cases in the future. -- about.me http://about.me/markgreene On Fri, Jun 13, 2014 at 3:47 PM, DuyHai Doan doanduy...@gmail.com wrote: This strikes me as bad practice in the world of multi tenant systems. I don't want to create a table per customer. So I'm wondering if dynamically modifying the table is an accepted practice? -- Can you give some details about your use case ? How would you alter a table structure to adapt it to a new customer ? Wouldn't it be better to model your table so that it supports addition/removal of customer ? On Fri, Jun 13, 2014 at 9:00 PM, Mark Greene green...@gmail.com wrote: Thanks DuyHai, I have a follow up question to #2. You mentioned ideally I would create a new table instead of mutating an existing one. This strikes me as bad practice in the world of multi tenant systems. I don't want to create a table per customer. So I'm wondering if dynamically modifying the table is an accepted practice? -- about.me http://about.me/markgreene On Fri, Jun 13, 2014 at 2:54 PM, DuyHai Doan doanduy...@gmail.com wrote: Hello Mark Dynamic columns, as you said, are perfectly supported by CQL3 via clustering columns. And no, using collections for storing dynamic data is a very bad idea if the cardinality is very high ( 1000 elements) 1) Is using Thrift a valid approach in the era of CQL? -- Less and less. Unless you are looking for extreme performance, you'd better off choosing CQL3. The ease of programming and querying with CQL3 does worth the small overhead in CPU 2) If CQL is the best practice, should I alter the schema at runtime when I detect I need to do an schema mutation? -- Ideally you should not alter schema but create a new table to adapt to your changing requirements. 3) If I utilize CQL collections, will Cassandra page the entire thing into the heap? -- Of course. All collections and maps in Cassandra are eagerly loaded entirely in memory on server side. That's why it is recommended to limit their cardinality to ~ 1000 elements On Fri, Jun 13, 2014 at 8:33 PM, Mark Greene green...@gmail.com wrote: I'm looking for some best practices w/r/t supporting arbitrary columns. It seems from the docs I've read around CQL that they are supported in some capacity via collections but you can't exceed 64K in size. For my requirements that would cause problems. So my questions are: 1) Is using Thrift a valid approach in the era of CQL? 2) If CQL is the best practice, should I alter the schema at runtime when I detect I need to do an schema mutation? 3) If I utilize CQL collections, will Cassandra page the entire thing into the heap? My data model is akin to a CRM, arbitrary column definitions per customer. Cheers, Mark
Re: Dynamic Columns in Cassandra 2.X
Yeah I don't anticipate more than 1000 properties, well under in fact. I guess the trade off of using the clustered columns is that I'd have a table that would be tall and skinny which also has its challenges w/r/t memory. I'll look into your suggestion a bit more and consider some others around a hybrid of CQL and Thrift (where necssary). But from a newb's perspective, I sense the community is unsettled around this concept of truly dynamic columns. Coming from an HBase background, it's a consideration I didn't anticipate having to evaluate. -- about.me http://about.me/markgreene On Fri, Jun 13, 2014 at 4:19 PM, DuyHai Doan doanduy...@gmail.com wrote: Hi Mark I believe that in your table you want to have some common fields that will be there whatever customer is, and other fields that are entirely customer-dependent, isn't it ? In this case, creating a table with static columns for the common fields and a clustering column representing all custom fields defined by a customer could be a solution (see here for static column: https://issues.apache.org/jira/browse/CASSANDRA-6561 ) CREATE TABLE user_data ( user_id bigint, user_firstname text static, user_lastname text static, ... custom_property_name text, custom_property_value text, PRIMARY KEY(user_id, custom_property_name, custom_property_value)); Please note that with this solution you need to have at least one custom property per customer to make it work The only thing to take care of is the type of custom_property_value. You need to define it once for all. To accommodate for dynamic types, you can either save the value as blob or text(as JSON) and take care of the serialization/deserialization yourself at the client side As an alternative you can save custom properties in a map, provided that their number is not too large. But considering the business case of CRM, I believe that it's quite rare and user has more than 1000 custom properties isn't it ? On Fri, Jun 13, 2014 at 10:03 PM, Mark Greene green...@gmail.com wrote: My use case requires the support of arbitrary columns much like a CRM. My users can define 'custom' fields within the application. Ideally I wouldn't have to change the schema at all, which is why I like the old thrift approach rather than the CQL approach. Having said all that, I'd be willing to adapt my API to make explicit schema changes to Cassandra whenever my user makes a change to their custom fields if that's an accepted practice. Ultimately, I'm trying to figure out of the Cassandra community intends to support true schemaless use cases in the future. -- about.me http://about.me/markgreene On Fri, Jun 13, 2014 at 3:47 PM, DuyHai Doan doanduy...@gmail.com wrote: This strikes me as bad practice in the world of multi tenant systems. I don't want to create a table per customer. So I'm wondering if dynamically modifying the table is an accepted practice? -- Can you give some details about your use case ? How would you alter a table structure to adapt it to a new customer ? Wouldn't it be better to model your table so that it supports addition/removal of customer ? On Fri, Jun 13, 2014 at 9:00 PM, Mark Greene green...@gmail.com wrote: Thanks DuyHai, I have a follow up question to #2. You mentioned ideally I would create a new table instead of mutating an existing one. This strikes me as bad practice in the world of multi tenant systems. I don't want to create a table per customer. So I'm wondering if dynamically modifying the table is an accepted practice? 
-- about.me http://about.me/markgreene On Fri, Jun 13, 2014 at 2:54 PM, DuyHai Doan doanduy...@gmail.com wrote: Hello Mark Dynamic columns, as you said, are perfectly supported by CQL3 via clustering columns. And no, using collections for storing dynamic data is a very bad idea if the cardinality is very high ( 1000 elements) 1) Is using Thrift a valid approach in the era of CQL? -- Less and less. Unless you are looking for extreme performance, you'd better off choosing CQL3. The ease of programming and querying with CQL3 does worth the small overhead in CPU 2) If CQL is the best practice, should I alter the schema at runtime when I detect I need to do an schema mutation? -- Ideally you should not alter schema but create a new table to adapt to your changing requirements. 3) If I utilize CQL collections, will Cassandra page the entire thing into the heap? -- Of course. All collections and maps in Cassandra are eagerly loaded entirely in memory on server side. That's why it is recommended to limit their cardinality to ~ 1000 elements On Fri, Jun 13, 2014 at 8:33 PM, Mark Greene green...@gmail.com wrote: I'm looking for some best practices w/r/t supporting arbitrary columns. It seems from the docs I've read around CQL that they are supported in some capacity via collections but you can't exceed 64K in size. For my requirements
Re: ec2 tests
If you give us an objective of the test that will help. Trying to get max write throughput? Read throughput? Weak consistency? On Thu, May 27, 2010 at 8:48 PM, Chris Dean ctd...@sokitomi.com wrote: I'm interested in performing some simple performance tests on EC2. I was thinking of using py_stress and Cassandra deployed on 3 servers with one separate machine to run py_stress. Are there any particular configuration settings I should use? I was planning on changing the JVM heap size to reflect the Large Instances we're using. Thanks! Cheers, Chris Dean
Re: ec2 tests
First thing I would do is stripe your EBS volumes. I've seen blogs that say this helps and blogs that say it's fairly marginal. (You may want to try Rackspace Cloud, as their local storage is much faster.)

Second, I would start out with N=2 and set W=1 and R=1. That will mirror your data across two of the three nodes and possibly give you stale data on the reads. If you feel you need stronger durability, increase N and W.

As far as heap memory, don't use 100% of the available physical RAM. Remember, the object heap will be smaller than your overall JVM process heap.

That should get you started.

On Fri, May 28, 2010 at 3:10 AM, Chris Dean ctd...@sokitomi.com wrote:

Mark Greene green...@gmail.com writes:
If you give us an objective of the test that will help. Trying to get max write throughput? Read throughput? Weak consistency?

I would like reading to be as fast as I can get. My real-world problem is write heavy, but the latency requirements are minimal on that side.

If there are any particular config settings that would help with the slow EC2 IO, that would be great to know.

Cheers,
Chris Dean
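For what it's worth, striping EBS volumes is usually done with software RAID 0. A minimal sketch; the device names, volume count, filesystem, and mount point are all placeholders for whatever your instances actually have attached:

    sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
    sudo mkfs.xfs /dev/md0
    sudo mount /dev/md0 /var/lib/cassandra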
Re: Why are writes faster than reads?
I'm fairly certain the write path hits the commit log first, then the memtable. 2010/5/25 Peter Schüller sc...@spotify.com I have seen several off-hand mentions that writes are inherently faster than reads. Why is this so? I believe the primary factor people are referring to is that writes are faster than reads in terms of disk I/O because writes are inherently sequential. Writes initially only happen in-memory plus in a (sequentially written) commit log; when flushed out to an sstable that is likewise sequential writing. Reads on the other hand, to the extent that they go down to disk, will suffer the usual overhead associated with disk seeks. See http://wiki.apache.org/cassandra/ArchitectureInternals for details. -- / Peter Schuller aka scode
delete mutation
Is there a particular reason why a timestamp is required to do a deletion? If I'm reading the API docs correctly, this would require a read of the column first, correct?

I know there is an issue filed to have a better way to delete via range slices, but I wanted to make sure this was the only way to do a delete.

Thanks in advance,
Mark
Re: Problems running Cassandra 0.6.1 on large EC2 instances.
Can you provide us with the current JVM args? Also, what type of work load you are giving the ring (op/s)? On Mon, May 17, 2010 at 6:39 PM, Curt Bererton c...@zipzapplay.com wrote: Hello Cassandra users+experts, Hopefully someone will be able to point me in the correct direction. We have cassandra 0.6.1 working on our test servers and we *thought* everything was great and ready to move to production. We are currently running a ring of 4 large instance EC2 (http://aws.amazon.com/ec2/instance-types/) servers on production with a replication factor of 3 and a QUORUM consistency level. We ran a test on 1% of our users, and everything was writing to and reading from cassandra great for the first 3 hours. After that point CPU usage spiked to 100% and stayed there, basically on all 4 machines at once. This smells to me like a GC issue, and I'm looking into it with jconsole right now. If anyone can help me debug this and get cassandra all the way up and running without CPU spiking I would be forever in their debt. I suspect that anyone else running cassandra on large EC2 instances might just be able to tell me what JVM args they are successfully using in a production environment and if they upgraded to Cassandra 0.6.2 from 0.6.1, and did they go to batched writes due to bug 1014? ( https://issues.apache.org/jira/browse/CASSANDRA-1014) That might answer all my questions. Is there anyone on the list who is using large EC2 instances in production? Would you be kind enough to share your JVM arguments and any other tips? Thanks for any help, Curt -- Curt, ZipZapPlay Inc., www.PlayCrafter.com, http://apps.facebook.com/happyhabitat
Re: replication impact on write throughput
If, for example, your replication factor is equal to the total number of nodes in the ring, I suspect you will hit a brick wall pretty soon. The biggest impact on your write performance will most likely be the consistency level of your writes -- in other words, how many nodes you want to wait for before you acknowledge the write back to the client.

On Tue, May 11, 2010 at 12:10 PM, Bill de hOra b...@dehora.net wrote:

If I had 10 Cassandra nodes each with a write capacity of 5K per second and a replication factor of 2, would that mean the expected write capacity of the system would be ~25K writes per second, because the nodes are also serving other nodes and not just clients?

I know this is a highly simplified take on things (i.e. no consideration for reads or quorum); I'm just trying to understand what the implication of replication is on write scalability. Intuitively it would seem actual write capacity is total write capacity divided by the replication factor.

Bill
Re: replication impact on write throughput
I was under the impression from what I've seen talked about on this list (perhaps I'm wrong here) that given the write throughput of one node in a cluster (again assuming each node has a given throughput and the same config) that you would simply multiply that throughput by the number of nodes you had, giving you the total throughput for the entire ring (like you said this is somewhat artificial). The main benefit being that adding capacity was as simple as adding more nodes to the ring with no degradation. So by your math, 100 nodes with each node getting 5k wps, I would assume the total capacity is 500k wps. But perhaps I've misunderstood some key concepts. Still a novice myself ;-) On Tue, May 11, 2010 at 3:08 PM, Bill de hOra b...@dehora.net wrote: Mark Greene wrote: If you have for example, your replication factor equal to the total amount of nodes in the ring, I suspect you will hit a brick wall pretty soon. Right :) So if we said there was 100 nodes at 5K wps with R=2, then would that suggest the cluster can support 250K wps? Again, I know this is a tad artificial, just trying to understand the impact of replication on writes. The biggest impact on your write performance will most likely be the consistency level of your writes. In other words, how many nodes you want to wait for before you acknowledge the write back to the client. I'd agree for any individual client; what I'm after is the overall capacity a cluster has over time in the face of replicas. But let's assume it's ConsistencyLevel.ONE - how would you think the available write capacity degrades? Bill On Tue, May 11, 2010 at 12:10 PM, Bill de hOra b...@dehora.net mailto: b...@dehora.net wrote: If I had 10 Cassandra nodes each with a write capacity of 5K per second and a replication factor of 2, would that mean the expected write capacity of the system would be ~25K writes per second because the nodes are also serving other nodes and not just clients? I know this is highly simplified take on things (ie no consideration for reads or quorum), I'm just trying to understand what the implication of replication is on write scalability. Intuitively it would seem actual write capacity is total write capacity divided by the replication factor. Bill
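As a rough back-of-the-envelope, Bill's intuition (total capacity divided by the replication factor) works out like this for the numbers in this thread; it ignores coordinator overhead, consistency level, and uneven key distribution:

    effective write capacity ~= (nodes x per-node writes/sec) / replication factor
                              = (100 x 5,000) / 2
                              = 250,000 client writes/sec

which is where the 250K figure comes from, versus 500K if replication is ignored.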
Re: pagination through slices with deleted keys
Hey Ian, I actually just wrote a quick example of how to iterate over a CF that may have tombstones. This may help you out: http://markjgreene.wordpress.com/2010/05/05/iterate-over-entire-cassandra-column-family/ On Thu, May 6, 2010 at 12:17 PM, Ian Kallen spidaman.l...@gmail.com wrote: I read the DistributedDeletes and the range_ghosts FAQ entry on the wiki which do a good job describing how difficult deletion is in an eventually consistent system. But practical application strategies for dealing with it aren't there (that I saw). I'm wondering how folks implement pagination in their applications; if you want to render N results in an application, is the only solution to over-fetch and filter out the tombstones? Or is there something simpler that I overlooked? I'd like to be able to count (even if the counts are approximate) and fetch rows with the deleted ones filtered out (without waiting for the GCGraceSeconds interval + compaction) but from what I see so far, the burden is on the app to deal with the tombstones. -Ian
Re: value size, is there a suggested limit?
http://wiki.apache.org/cassandra/CassandraLimitations On Sun, Apr 25, 2010 at 4:19 PM, S Ahmed sahmed1...@gmail.com wrote: Is there a suggested sized maximum that you can set the value of a given key? e.g. could I convert a document to bytes and store it as a value to a key? if yes, which I presume so, what if the file is 10mb? or 100mb?
Re: getting cassandra setup on windows 7
Try the cassandra-with-fixes.bat file attached to the issue (https://issues.apache.org/jira/secure/attachment/12442349/cassandra-with-fixes.bat). I had the same issue, and that bat file got Cassandra to start. It still throws another error complaining about the log4j.properties.

On Fri, Apr 23, 2010 at 1:59 PM, S Ahmed sahmed1...@gmail.com wrote:

Any insights? Much appreciated!

On Thu, Apr 22, 2010 at 11:13 PM, S Ahmed sahmed1...@gmail.com wrote:

I was just reading that, thanks. What does he mean when he says: "This appears to be related to data storage paths I set, because if I switch the paths back to the default UNIX paths, everything runs fine"?

On Thu, Apr 22, 2010 at 11:07 PM, Jonathan Ellis jbel...@gmail.com wrote:

https://issues.apache.org/jira/browse/CASSANDRA-948

On Thu, Apr 22, 2010 at 10:03 PM, S Ahmed sahmed1...@gmail.com wrote:

Ok so I found the config section:

<CommitLogDirectory>E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\commitlog</CommitLogDirectory>
<DataFileDirectories>
    <DataFileDirectory>E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\data</DataFileDirectory>
</DataFileDirectories>

Now when I run bin/cassandra I get:

Starting cassandra server
listening for transport dt_socket at address:
exception in thread main java.lang.noclassDefFoundError: org/apache/cassthreft/cassandraDaemon
could not find the main class: org.apache.cassandra.threif.cassandraDaemon...

On Thu, Apr 22, 2010 at 10:53 PM, S Ahmed sahmed1...@gmail.com wrote:

So I uncompressed the .tar, and in the readme it says:

* tar -zxvf cassandra-$VERSION.tgz
* cd cassandra-$VERSION
* sudo mkdir -p /var/log/cassandra
* sudo chown -R `whoami` /var/log/cassandra
* sudo mkdir -p /var/lib/cassandra
* sudo chown -R `whoami` /var/lib/cassandra

My cassandra is at: c:\java\cassandra\apache-cassandra-0.6.1/

So I have to create 2 folders, log and lib? Is there a setting in a config file that I edit?
Re: Cassandra tuning for running test on a desktop
RAM doesn't necessarily need to be proportional but I would say the number of nodes does. You can't just throw a bazillion inserts at one node. This is the main benefit of Cassandra is that if you start hitting your capacity, you add more machines and distribute the keys across more machines. On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot nith...@gmail.com wrote: So does it means the RAM needed is proportionnal with the data handled ? Or Cassandra need a minimum amount or RAM when dataset is big? I must confess this OOM behaviour is strange. On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones mjo...@imagehawk.com wrote: On my 4GB machine I’m giving it 3GB and having no trouble with 60+ million 500 byte columns *From:* Nicolas Labrot [mailto:nith...@gmail.com] *Sent:* Wednesday, April 21, 2010 7:47 AM *To:* user@cassandra.apache.org *Subject:* Re: Cassandra tuning for running test on a desktop I have try 1400M, and Cassandra OOM too. Is there another solution ? My data isn't very big. It seems that is the merge of the db On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene green...@gmail.com wrote: Trying increasing Xmx. 1G is probably not enough for the amount of inserts you are doing. On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot nith...@gmail.com wrote: Hello, For my first message I will first thanks Cassandra contributors for their great works. I have a parameter issue with Cassandra (I hope it's just a parameter issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a simple dual core with 4GB of RAM on WinXP. I have keep the default JVM option inside cassandra.bat (Xmx1G) I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF (named Super1). The insertion go to 1 millions of SC (without slowdown) and Cassandra crash because of an OOM. (I store an average of 100 bytes per SC with a max of 10kB). I have aggressively decreased all the memories parameters without any respect to the consistency (My config is here [1]), the cache is turn off but Cassandra still go to OOM. I have joined the last line of the Cassandra life [2]. What can I do to fix my issue ? Is there another solution than increasing the Xmx ? 
Thanks for your help,

Nicolas

[1]
<Keyspaces>
    <Keyspace Name="Keyspace1">
        <ColumnFamily Name="Super1" ColumnType="Super" CompareWith="BytesType" CompareSubcolumnsWith="BytesType" />
        <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
        <ReplicationFactor>1</ReplicationFactor>
        <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
    </Keyspace>
</Keyspaces>
<CommitLogRotationThresholdInMB>32</CommitLogRotationThresholdInMB>
<DiskAccessMode>auto</DiskAccessMode>
<RowWarningThresholdInMB>64</RowWarningThresholdInMB>
<SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
<FlushDataBufferSizeInMB>16</FlushDataBufferSizeInMB>
<FlushIndexBufferSizeInMB>4</FlushIndexBufferSizeInMB>
<ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
<MemtableThroughputInMB>16</MemtableThroughputInMB>
<BinaryMemtableThroughputInMB>32</BinaryMemtableThroughputInMB>
<MemtableOperationsInMillions>0.01</MemtableOperationsInMillions>
<MemtableObjectCountInMillions>0.01</MemtableObjectCountInMillions>
<MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
<ConcurrentReads>4</ConcurrentReads>
<ConcurrentWrites>8</ConcurrentWrites>
</Storage>

[2]
INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=5417524)
INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
INFO 13:36:41,062 Writing Memtable(Super1)@15385755
INFO 13:36:42,062 Completed flushing d:\cassandra\data\Keyspace1\Super1-711-Data.db
INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=6065637)
INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
INFO 13:36:45,796 Writing Memtable(Super1)@15578910
INFO 13:36:46,109 Completed flushing d:\cassandra\data\Keyspace1\Super1-712-Data.db
INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed leaving 922392600 used; max is 1174208512
INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=6722241)
INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
INFO 13:36:54,593 Writing Memtable(Super1)@24468872
INFO 13:36:55,421 Completed flushing d:\cassandra\data\Keyspace1\Super1-713-Data.db
java.lang.OutOfMemoryError: Java heap space
INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed leaving 971904520 used; max is 1174208512
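Since the thread keeps coming back to -Xmx, for context: in the 0.6-era scripts the heap flags live in bin/cassandra.in.sh (or cassandra.bat on Windows), and raising the heap is just a matter of editing the JVM options there. The exact variable name differs between the two scripts, so treat this as a rough sketch rather than exact file contents:

    # excerpt of the JVM options, raised from the 1G default
    -Xms2G
    -Xmx3G

On a 4GB desktop like Nicolas', a 2-3GB heap matches what Mark Jones reports using above.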
Re: Cassandra tuning for running test on a desktop
Hit send to early That being said a lot of people running Cassandra in production are using 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully gives you some perspective. On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene green...@gmail.com wrote: RAM doesn't necessarily need to be proportional but I would say the number of nodes does. You can't just throw a bazillion inserts at one node. This is the main benefit of Cassandra is that if you start hitting your capacity, you add more machines and distribute the keys across more machines. On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot nith...@gmail.com wrote: So does it means the RAM needed is proportionnal with the data handled ? Or Cassandra need a minimum amount or RAM when dataset is big? I must confess this OOM behaviour is strange. On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones mjo...@imagehawk.com wrote: On my 4GB machine I’m giving it 3GB and having no trouble with 60+ million 500 byte columns *From:* Nicolas Labrot [mailto:nith...@gmail.com] *Sent:* Wednesday, April 21, 2010 7:47 AM *To:* user@cassandra.apache.org *Subject:* Re: Cassandra tuning for running test on a desktop I have try 1400M, and Cassandra OOM too. Is there another solution ? My data isn't very big. It seems that is the merge of the db On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene green...@gmail.com wrote: Trying increasing Xmx. 1G is probably not enough for the amount of inserts you are doing. On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot nith...@gmail.com wrote: Hello, For my first message I will first thanks Cassandra contributors for their great works. I have a parameter issue with Cassandra (I hope it's just a parameter issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a simple dual core with 4GB of RAM on WinXP. I have keep the default JVM option inside cassandra.bat (Xmx1G) I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF (named Super1). The insertion go to 1 millions of SC (without slowdown) and Cassandra crash because of an OOM. (I store an average of 100 bytes per SC with a max of 10kB). I have aggressively decreased all the memories parameters without any respect to the consistency (My config is here [1]), the cache is turn off but Cassandra still go to OOM. I have joined the last line of the Cassandra life [2]. What can I do to fix my issue ? Is there another solution than increasing the Xmx ? 
Thanks for your help, Nicolas [1] Keyspaces Keyspace Name=Keyspace1 ColumnFamily Name=Super1 ColumnType=Super CompareWith=BytesType CompareSubcolumnsWith=BytesType / ReplicaPlacementStrategyorg.apache.cassandra.locator.RackUnawareStrategy/ReplicaPlacementStrategy ReplicationFactor1/ReplicationFactor EndPointSnitchorg.apache.cassandra.locator.EndPointSnitch/EndPointSnitch /Keyspace /Keyspaces CommitLogRotationThresholdInMB32/CommitLogRotationThresholdInMB DiskAccessModeauto/DiskAccessMode RowWarningThresholdInMB64/RowWarningThresholdInMB SlicedBufferSizeInKB64/SlicedBufferSizeInKB FlushDataBufferSizeInMB16/FlushDataBufferSizeInMB FlushIndexBufferSizeInMB4/FlushIndexBufferSizeInMB ColumnIndexSizeInKB64/ColumnIndexSizeInKB MemtableThroughputInMB16/MemtableThroughputInMB BinaryMemtableThroughputInMB32/BinaryMemtableThroughputInMB MemtableOperationsInMillions0.01/MemtableOperationsInMillions MemtableObjectCountInMillions0.01/MemtableObjectCountInMillions MemtableFlushAfterMinutes60/MemtableFlushAfterMinutes ConcurrentReads4/ConcurrentReads ConcurrentWrites8/ConcurrentWrites /Storage [2] INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=5417524) INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755 INFO 13:36:41,062 Writing Memtable(Super1)@15385755 INFO 13:36:42,062 Completed flushing d:\cassandra\data\Keyspace1\Super1-711-Data.db INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=6065637) INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910 INFO 13:36:45,796 Writing Memtable(Super1)@15578910 INFO 13:36:46,109 Completed flushing d:\cassandra\data\Keyspace1\Super1-712-Data.db INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed leaving 922392600 used; max is 1174208512 INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=6722241) INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872 INFO 13:36:54,593 Writing Memtable(Super1)@24468872 INFO 13:36:55,421 Completed
Re: Cassandra tuning for running test on a desktop
Maybe, maybe not. Presumably if you are running a RDMS with any reasonable amount of traffic now a days, it's sitting on a machine with 4-8G of memory at least. On Wed, Apr 21, 2010 at 10:48 AM, Nicolas Labrot nith...@gmail.com wrote: Thanks Mark. Cassandra is maybe too much for my need ;) On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene green...@gmail.com wrote: Hit send to early That being said a lot of people running Cassandra in production are using 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully gives you some perspective. On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene green...@gmail.com wrote: RAM doesn't necessarily need to be proportional but I would say the number of nodes does. You can't just throw a bazillion inserts at one node. This is the main benefit of Cassandra is that if you start hitting your capacity, you add more machines and distribute the keys across more machines. On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot nith...@gmail.comwrote: So does it means the RAM needed is proportionnal with the data handled ? Or Cassandra need a minimum amount or RAM when dataset is big? I must confess this OOM behaviour is strange. On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones mjo...@imagehawk.comwrote: On my 4GB machine I’m giving it 3GB and having no trouble with 60+ million 500 byte columns *From:* Nicolas Labrot [mailto:nith...@gmail.com] *Sent:* Wednesday, April 21, 2010 7:47 AM *To:* user@cassandra.apache.org *Subject:* Re: Cassandra tuning for running test on a desktop I have try 1400M, and Cassandra OOM too. Is there another solution ? My data isn't very big. It seems that is the merge of the db On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene green...@gmail.com wrote: Trying increasing Xmx. 1G is probably not enough for the amount of inserts you are doing. On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot nith...@gmail.com wrote: Hello, For my first message I will first thanks Cassandra contributors for their great works. I have a parameter issue with Cassandra (I hope it's just a parameter issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a simple dual core with 4GB of RAM on WinXP. I have keep the default JVM option inside cassandra.bat (Xmx1G) I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF (named Super1). The insertion go to 1 millions of SC (without slowdown) and Cassandra crash because of an OOM. (I store an average of 100 bytes per SC with a max of 10kB). I have aggressively decreased all the memories parameters without any respect to the consistency (My config is here [1]), the cache is turn off but Cassandra still go to OOM. I have joined the last line of the Cassandra life [2]. What can I do to fix my issue ? Is there another solution than increasing the Xmx ? 
Thanks for your help, Nicolas [1] Keyspaces Keyspace Name=Keyspace1 ColumnFamily Name=Super1 ColumnType=Super CompareWith=BytesType CompareSubcolumnsWith=BytesType / ReplicaPlacementStrategyorg.apache.cassandra.locator.RackUnawareStrategy/ReplicaPlacementStrategy ReplicationFactor1/ReplicationFactor EndPointSnitchorg.apache.cassandra.locator.EndPointSnitch/EndPointSnitch /Keyspace /Keyspaces CommitLogRotationThresholdInMB32/CommitLogRotationThresholdInMB DiskAccessModeauto/DiskAccessMode RowWarningThresholdInMB64/RowWarningThresholdInMB SlicedBufferSizeInKB64/SlicedBufferSizeInKB FlushDataBufferSizeInMB16/FlushDataBufferSizeInMB FlushIndexBufferSizeInMB4/FlushIndexBufferSizeInMB ColumnIndexSizeInKB64/ColumnIndexSizeInKB MemtableThroughputInMB16/MemtableThroughputInMB BinaryMemtableThroughputInMB32/BinaryMemtableThroughputInMB MemtableOperationsInMillions0.01/MemtableOperationsInMillions MemtableObjectCountInMillions0.01/MemtableObjectCountInMillions MemtableFlushAfterMinutes60/MemtableFlushAfterMinutes ConcurrentReads4/ConcurrentReads ConcurrentWrites8/ConcurrentWrites /Storage [2] INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=5417524) INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755 INFO 13:36:41,062 Writing Memtable(Super1)@15385755 INFO 13:36:42,062 Completed flushing d:\cassandra\data\Keyspace1\Super1-711-Data.db INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', position=6065637) INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910 INFO 13:36:45,796 Writing Memtable(Super1)@15578910 INFO 13:36:46,109 Completed flushing d:\cassandra\data\Keyspace1\Super1-712-Data.db INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed
Re: At what point does the cluster get faster than the individual nodes?
Right, it's a similar concept to DB sharding, where you spread the write load around to different DB servers: it won't necessarily increase the throughput of any one DB server, but rather the collective throughput.

On Wed, Apr 21, 2010 at 12:16 PM, Mike Gallamore mike.e.gallam...@googlemail.com wrote:

Some people might be able to answer this better than me. However: with quorum consistency you have to communicate with n/2 + 1 nodes, where n is the replication factor. So unless you are disk bound, your real expense is going to be all those extra network latencies. I'd expect that you'll see a relatively flat throughput per thread once you reach the point that you aren't disk or CPU bound. That said, the extra nodes mean you should be able to handle more threads/connections at the same throughput on each thread/connection. So a bigger cluster doesn't mean a single job goes faster necessarily, just that you can handle more jobs at the same time.

On 04/21/2010 08:28 AM, Mark Jones wrote:

I'm seeing a cluster of 4 (replication factor=2) that is overall barely faster than the slowest node in the group. When I run the 4 nodes individually, I see:

For inserts:
Two nodes @ 12000/second
1 node @ 9000/second
1 node @ 7000/second

For reads: abysmal, less than 1000/second (not range slices, individual lookups), disk util @ 88+%.

How many nodes are required before you see a net positive gain on inserts and reads (QUORUM consistency on both)? When I use my 2 fastest nodes as a pair, the throughput is around 9000 inserts/second.

What is a good to excellent hardware config for Cassandra? I have separate drives for data and commit log and 8GB in 3 machines (all dual core). My fastest insert node has 4GB and a triple core processor.

I've run py_stress, and my C++ code beats it by several 1000 inserts/second toward the end of the runs, so I don't think it is my app, and I've removed the super columns per some suggestions yesterday.

When Cassandra is working, it performs well; the problem is that it frequently slows down to 50% of its peaks and occasionally slows down to 0 inserts/second, which greatly reduces aggregate throughput.
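To put numbers on Mike's n/2 + 1 point (a rough sketch of the quorum arithmetic, nothing cluster-specific):

    quorum = floor(replication_factor / 2) + 1
    RF = 2  ->  quorum = 2   (every QUORUM read and write waits on both replicas)
    RF = 3  ->  quorum = 2   (one slow replica can be left behind)

With RF=2 and QUORUM on both reads and writes, as in Mark Jones' test, every operation is gated on the slower of the two replicas, which is consistent with the cluster tracking its slowest node.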
Re: Just to be clear, cassandra is web framework agnostic b/c of Thrift?
I'll try to test this out tonight. On Wed, Apr 21, 2010 at 1:07 PM, Jonathan Ellis jbel...@gmail.com wrote: There is a patch attached to https://issues.apache.org/jira/browse/CASSANDRA-948 that needs volunteers to test. On Sun, Apr 18, 2010 at 11:13 PM, Mark Greene green...@gmail.com wrote: With the 0.6.0 release, the windows cassandra.bat file errors out. There's a bug filed for this already. There's a README or something similar in the install directory, that tells you the basic CLI operations and explains the basic data model. On Sun, Apr 18, 2010 at 11:23 PM, S Ahmed sahmed1...@gmail.com wrote: Interesting, I'm just finding windows to be a pain, particular starting up java apps. (I guess I just need to learn!) How exactly would you startup Cassandra on a windows machine? i.e when the server reboots, how will it run the java -jar cassandar ? On Sun, Apr 18, 2010 at 7:35 PM, Joe Stump j...@joestump.net wrote: On Apr 18, 2010, at 5:33 PM, S Ahmed wrote: Obviously if you run asp.net on windows, it is probably a VERY good idea to be running cassandra on a linux box. Actually, I'm not sure this is true. A few people have found Windows performs fairly well with Cassandra, if I recall correctly. Obviously, all of the testing and most of the bigger users are running on Linux though. --Joe
Re: CassandraLimitations
Hey Bill, Are you asking if there are limits in the context of a single node or a ring of nodes? On Wed, Apr 21, 2010 at 3:58 PM, Bill de hOra b...@dehora.net wrote: http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on the limits around columns. Are there are design (or practical) limits to the number of rows a keyspace can have? Bill
Re: Just to be clear, cassandra is web framework agnostic b/c of Thrift?
With the 0.6.0 release, the windows cassandra.bat file errors out. There's a bug filed for this already. There's a README or something similar in the install directory, that tells you the basic CLI operations and explains the basic data model. On Sun, Apr 18, 2010 at 11:23 PM, S Ahmed sahmed1...@gmail.com wrote: Interesting, I'm just finding windows to be a pain, particular starting up java apps. (I guess I just need to learn!) How exactly would you startup Cassandra on a windows machine? i.e when the server reboots, how will it run the java -jar cassandar ? On Sun, Apr 18, 2010 at 7:35 PM, Joe Stump j...@joestump.net wrote: On Apr 18, 2010, at 5:33 PM, S Ahmed wrote: Obviously if you run asp.net on windows, it is probably a VERY good idea to be running cassandra on a linux box. Actually, I'm not sure this is true. A few people have found Windows performs fairly well with Cassandra, if I recall correctly. Obviously, all of the testing and most of the bigger users are running on Linux though. --Joe
Forced Failover Test for 0.6.0-RC1
Hi,

I'm testing out failover for 0.6.0-RC1 and seeing varied behavior in Cassandra's ability to replay the commit log after a forced failure. My test is this:

1) Run ./cassandra -f
2) Insert a value through the CLI and immediately force a shutdown of Cassandra after I see the "Value inserted" confirmation from the CLI.

If I shut down Cassandra quickly, within a second or two of seeing "Value inserted", the data seems to be lost after starting Cassandra back up. I did a nodetool repair and flush as well, just in case that might have had any effect.

Conversely, if I waited about 5 seconds before shutting down Cassandra, then started it again, I was able to retrieve the data from the CLI.

Any ideas?

Thanks in advance,
Mark
Re: Forced Failover Test for 0.6.0-RC1
Ah ok. Sorry for the RTFM fail John ;-). For my test with a single node, batch would make sense if I needed better durability but with a cluster it's less of a concern with replication. Thanks. -Mark On Sat, Apr 10, 2010 at 11:49 AM, Jonathan Ellis jbel...@gmail.com wrote: http://wiki.apache.org/cassandra/Durability On Sat, Apr 10, 2010 at 10:38 AM, Mark Greene green...@gmail.com wrote: Hi, I'm testing out failover for 0.6.0-RC1 and seeing varied behavior in Cassandra's ability to replay the commit log after a forced failure. My test is this: 1) Run ./cassandra -f 2) Insert a value through the CLI and immediately force a shutdown of cassandra after I see the Value inserted confirmation from the CLI. If I shutdown cassandra quickly, within a second or two of seeing Value Inserted the data seems to be lost after starting cassandra back up. I did a nodetool repair and flush as well just in case that may have had any affect. Conversely, if I waited about 5 seconds before shutting down cassandra, then started it again, I was able to retrieve the data from the CLI. Any ideas? Thanks in advance, Mark
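The periodic-versus-batch trade-off behind Jonathan's link is controlled in the 0.6-era storage-conf.xml. The element names and values below are from memory of that config, so double-check them against your copy; use one mode or the other, not both:

    <!-- default: fsync the commit log every few seconds; a crash inside that
         window can lose acknowledged writes on a single node -->
    <CommitLogSync>periodic</CommitLogSync>
    <CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>

    <!-- stronger single-node durability: group writes and fsync before acking -->
    <CommitLogSync>batch</CommitLogSync>
    <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS>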
Re: Write consistency
So unless you re-try the write, the previous stale write stays on the other two nodes? Would a read repair fix this eventually?

On Thu, Apr 8, 2010 at 11:36 AM, Avinash Lakshman avinash.laksh...@gmail.com wrote:

What you're describing is a distributed transaction. Generally, strong consistency is always associated with doing transactional writes, where you never see the results of a failed write on a subsequent read, no matter what happens. Cassandra has no notion of rollback. That is why no combination will give you strong consistency. The idea is you re-try the failed write, and eventually the system will have gotten rid of the previous stale write.

Avinash

On Thu, Apr 8, 2010 at 8:29 AM, Jeremy Dunck jdu...@gmail.com wrote:

On Thu, Apr 8, 2010 at 7:16 AM, Gary Dusbabek gdusba...@gmail.com wrote:

On Thu, Apr 8, 2010 at 02:55, Paul Prescod p...@ayogo.com wrote:

In this¹ debate, there seemed to be consensus on the following fact:

In Cassandra, say you use N=3, W=3, R=1. Let's say you managed to only write to replicas A and B, but not C. In this case Cassandra will return an error to the application saying the write failed, which is acceptable given that W=3. But Cassandra does not clean up / roll back the writes that happened to A and B.

correct: no rolling back. Cassandra does go out of its way to make sure the cluster is healthy enough to begin the write though.

I think the general answer here is don't use R=1 if you can't tolerate inconsistency?

Still the point of confusion -- if W=3 and the write succeeds on 2 nodes but fails on the 3rd, the write fails (to the updating client), but is the data on the two successful nodes still readable (i.e. reading from what was actually a failed write)?
Separate disks with cloud deployment
The FAQ page makes mention of using separate disks for the commit log and data directory. How would one go about achieving this in a cloud deployment such as Rackspace cloud servers or EC2 EBS? Or is it just preferred to use dedicated hardware to get the optimal performance? Thanks In Advance! Best, Mark
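In case it helps to see it concretely, the split is just two settings in the 0.6-era storage-conf.xml (the same elements that appear elsewhere in this archive); the mount points below are placeholders, and on EC2 one arrangement people use is the commit log on a local/ephemeral disk with the data directories on an EBS volume:

    <CommitLogDirectory>/mnt/commitlog</CommitLogDirectory>
    <DataFileDirectories>
        <DataFileDirectory>/ebs/cassandra/data</DataFileDirectory>
    </DataFileDirectories>

On a provider without separately attachable volumes you may simply not have two independent disks, which is the trade-off behind the dedicated-hardware question.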