JDBC connectivity for Cassandra?
Hi, is there any official JDBC connectivity for DataStax Cassandra (or Cassandra in general)? I am looking to fire inserts into Cassandra like I can do in CQL. Thanks.
BATCH consistency level VS. individual statement consistency level
Hi,

I'm not sure how consistency level is applied to a batch statement. I didn't find detailed information about that on datastax.com (1) http://docs.datastax.com/en/cql/3.0/cql/cql_reference/batch_r.html

- It is possible to set a CL on individual statements.
- It is possible to set a CL on the whole batch.
- If no CL is set, then the default CL is used (2) http://docs.datastax.com/en/cql/3.0/cql/cql_reference/consistency_r.html which is ONE (3) http://docs.datastax.com/en/cassandra/1.2/cassandra/dml/dml_config_consistency_c.html?scroll=concept_ds_umf_5xx_zj__configuring-client-consistency-levels

I understand that if the CL is only set on the batch, then this CL is applied to the individual statements. But how are consistency levels applied in these cases?

1. When consistency levels differ between the statements and the batch:

BEGIN BATCH USING CONSISTENCY QUORUM
  INSERT INTO users (...) VALUES (...) USING CONSISTENCY LOCAL_QUORUM
  UPDATE users SET password = '...' WHERE userID = '...' USING CONSISTENCY LOCAL_ONE
  ...
APPLY BATCH;

2. When the consistency level is set on the statements but not on the batch (will it be overridden by the default consistency level, i.e. ONE?):

BEGIN BATCH
  INSERT INTO users (...) VALUES (...) USING CONSISTENCY LOCAL_QUORUM
  UPDATE users SET password = '...' WHERE userID = '...' USING CONSISTENCY LOCAL_ONE
  ...
APPLY BATCH;

Cheers,
— Brice
Example Data Modelling
Hi,

I have a basic doubt. I have an RDBMS with the following two tables:

Emp - EmpID, FN, LN, Phone, Address
Sal - Month, EmpID, Basic, Flexible Allowance

My use case is to print the salary slip at the end of each month, and the slip contains the emp name and his other details. Now, if I want to have the same in Cassandra, I will have a single cf with emp personal details and his salary details. Is this the right approach? Should we have the employee personal details duplicated each month?

Regards,
Seenu.
Re: Compaction issues, 2.0.12
Thanks Rob.

1) Cassandra version is 2.0.12.

2) Interesting. Looking at JMX org.apache.cassandra.db - ColumnFamilies - trackcontent - track_content - Attributes, I get:

LiveDiskSpaceUsed: 17788740448, i.e. ~17GB
LiveSSTableCount: 3
TotalDiskSpaceUsed: 55714084629, i.e. ~55GB

So it obviously knows about the extra disk space even though the live space looks correct. I couldn't find anything to identify the actual files though.

3) So that was even more interesting. After restarting the cassandra daemon, the sstables were not deleted and now the same JMX attributes are:

LiveDiskSpaceUsed: 55681040579, i.e. ~55GB
LiveSSTableCount: 8
TotalDiskSpaceUsed: 55681040579, i.e. ~55GB

So some of my non-live sstables are back live again, and obviously some of the big ones!!

Jeff

On 6 July 2015 at 20:26, Robert Coli rc...@eventbrite.com wrote:

On Mon, Jul 6, 2015 at 7:25 AM, Jeff Williams je...@wherethebitsroam.com wrote:

So it appears that the 10 sstables that were compacted to /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57372 are still sitting around, with a couple of them being very large! Any ideas for what I can check? Can I just delete the old sstables? (I'm guessing not.)

1) What version of Cassandra?

2) There are JMX methods in some versions to identify which files are live in the data dir; you could try to use those to see if those files are live. Does load seem to show that they are live from the perspective of Cassandra?

3) If they are not live, restarting the Cassandra daemon on the node will likely result in them being deleted.

=Rob
Re: What are problems with schema disagreement
On Mon, Jul 6, 2015 at 1:30 PM, John Wong gokoproj...@gmail.com wrote:

But is there a problem with letting schema disagreement run for a long time?

It depends on what the nature of the desynch is, but generally speaking there may be. If you added a column or a columnfamily, and one node didn't get that update, that node will except (throw exceptions) when your clients attempt to read that column/columnfamily. And so on...

=Rob
Re: Experiencing Timeouts on one node
When we reboot the problematic node, we see the following errors in system.log.

1. Does this mean the hints column family is corrupted?
2. Can we scrub the system column family on the problematic node and its replication partners?
3. How do we rebuild the System keyspace?

==
ERROR [CompactionExecutor:950] 2015-06-27 20:11:44,595 CassandraDaemon.java (line 191) Exception in thread Thread[CompactionExecutor:950,1,main]
java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

ERROR [HintedHandoff:552] 2015-06-27 20:11:44,595 CassandraDaemon.java (line 191) Exception in thread Thread[HintedHandoff:552,1,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:436)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
    at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90)
    at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:502)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
    at java.util.concurrent.FutureTask.get(Unknown Source)
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:432)
    ... 6 more
Caused by: java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
==

On Wed, Jul 1, 2015 at 11:59 AM, Shashi Yachavaram shashi...@gmail.com wrote:

We have a 28-node cluster, out of which only one node is experiencing timeouts. We thought it was the RAID, but there are two other nodes on the same RAID without any problem. Also, the problem goes away if we reboot the node, and then reappears after seven days. The following hinted handoff timeouts are seen on the node experiencing the timeouts. We did not notice any gossip errors. I was wondering if anyone has seen this issue and how they resolved it.

Cassandra Version: 1.2.15.1
OS: Linux cm 2.6.32-504.8.1.el6.x86_64 #1 SMP Fri Dec 19 12:09:25 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
java version 1.6.0_85

INFO [HintedHandoff:2] 2015-06-17
Re: Compaction issues, 2.0.12
On Mon, Jul 6, 2015 at 1:46 PM, Jeff Williams je...@wherethebitsroam.com wrote:

1) Cassandra version is 2.0.12.

2) Interesting. Looking at JMX org.apache.cassandra.db - ColumnFamilies - trackcontent - track_content - Attributes, I get:

LiveDiskSpaceUsed: 17788740448, i.e. ~17GB
LiveSSTableCount: 3
TotalDiskSpaceUsed: 55714084629, i.e. ~55GB

So it obviously knows about the extra disk space even though the live space looks correct. I couldn't find anything to identify the actual files though.

That's what I would expect.

3) So that was even more interesting. After restarting the cassandra daemon, the sstables were not deleted and now the same JMX attributes are:

LiveDiskSpaceUsed: 55681040579, i.e. ~55GB
LiveSSTableCount: 8
TotalDiskSpaceUsed: 55681040579, i.e. ~55GB

So some of my non-live tables are back live again, and obviously some of the big ones!!

This is permanently fatal to consistency; sorry if I was not clear enough that if they were not live, there was some risk of Cassandra considering them live again upon restart. If I were you, I would either stop the node and remove the files you know shouldn't be live, or do a major compaction ASAP.

The behavior you just encountered sounds like a bug, and it is a rather serious one. SSTables which should be dead being marked live is very bad for consistency. Do you see any exceptions in your logs or anything? If you can repro, you should file a JIRA ticket with the Apache project...

=Rob
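[Editor's sketch of the second remedy Rob suggests, a major compaction of the affected table; the keyspace and table names are taken from earlier in the thread, and this assumes nodetool is run on the affected node:]

```
# Force a major compaction of the table whose dead sstables came back live.
nodetool compact trackcontent track_content
```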
Re: What are problems with schema disagreement
Thanks. Yeah, we typically restart the nodes in the minor version to force a resync. But is there a problem with letting schema disagreement run for a long time?

Thanks.
John

On Mon, Jul 6, 2015 at 2:29 PM, Robert Coli rc...@eventbrite.com wrote:

On Thu, Jul 2, 2015 at 9:31 PM, John Wong gokoproj...@gmail.com wrote:

Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to upgrade in the near future. The load on the cluster at the time was very, very low. All nodes were responsive, except nothing showed up in the logs after a certain time, which led me to believe something happened internally, although that was a poor wild guess. But is it safe to be okay with schema disagreement? I worry about data consistency if I let it sit too long.

In general one shouldn't run with schema disagreement persistently. I've seen schema desynch issues on 1.2.x; in general, restarting some unclear subset of the affected daemons made them synch.

=Rob
Wrong peers
Hey guys,

I'm using the Ruby driver ( http://datastax.github.io/ruby-driver/ ) for backup scripts. I tried to discover all peers and got wrong peers that differ from nodetool status.

=
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load     Tokens  Owns  Host ID                               Rack
UN  10.40.231.53  1.18 TB  256     ?     b2d877d7-f031-4190-8569-976bb0ce034f  RACK01
UN  10.40.231.11  1.24 TB  256     ?     e15cda1c-65cc-40cb-b85c-c4bd665d02d7  RACK01

cqlsh> use system;
cqlsh:system> select peer from system.peers;

 peer
--------------
 10.40.231.31
 10.40.231.53

(2 rows)

What to do with these old peers? Can they be removed without consequences, since they are no longer in the cluster? And how do I keep the peers up to date?

--
Anton Koshevoy
Re: Wrong peers
Anton,

I have also seen this issue with decommissioned nodes remaining in the system.peers table. On the bright side, they can be safely removed from the system.peers table without issue. You will have to check every node in the cluster, since this data is local to each node.

Jeff

On 6 July 2015 at 22:45, nowarry nowa...@gmail.com wrote:

Hey guys,

I'm using the Ruby driver ( http://datastax.github.io/ruby-driver/ ) for backup scripts. I tried to discover all peers and got wrong peers that differ from nodetool status.

=
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load     Tokens  Owns  Host ID                               Rack
UN  10.40.231.53  1.18 TB  256     ?     b2d877d7-f031-4190-8569-976bb0ce034f  RACK01
UN  10.40.231.11  1.24 TB  256     ?     e15cda1c-65cc-40cb-b85c-c4bd665d02d7  RACK01

cqlsh> use system;
cqlsh:system> select peer from system.peers;

 peer
--------------
 10.40.231.31
 10.40.231.53

(2 rows)

What to do with these old peers? Can they be removed without consequences, since they are no longer in the cluster? And how do I keep the peers up to date?

--
Anton Koshevoy
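[Editor's note: a minimal sketch of the cleanup Jeff describes, using the stale IP from Anton's listing. Since system.peers is local to each node, this would be run via cqlsh against every node that still shows the stale entry:]

```
cqlsh> DELETE FROM system.peers WHERE peer = '10.40.231.31';
```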
Re: Compaction issues, 2.0.12
On Mon, Jul 6, 2015 at 7:25 AM, Jeff Williams je...@wherethebitsroam.com wrote:

So it appears that the 10 sstables that were compacted to /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57372 are still sitting around, with a couple of them being very large! Any ideas for what I can check? Can I just delete the old sstables? (I'm guessing not.)

1) What version of Cassandra?

2) There are JMX methods in some versions to identify which files are live in the data dir; you could try to use those to see if those files are live. Does load seem to show that they are live from the perspective of Cassandra?

3) If they are not live, restarting the Cassandra daemon on the node will likely result in them being deleted.

=Rob
Re: What are problems with schema disagreement
On Thu, Jul 2, 2015 at 9:31 PM, John Wong gokoproj...@gmail.com wrote:

Hi Graham. Thanks. We are still running on 1.2.16, but we do plan to upgrade in the near future. The load on the cluster at the time was very, very low. All nodes were responsive, except nothing showed up in the logs after a certain time, which led me to believe something happened internally, although that was a poor wild guess. But is it safe to be okay with schema disagreement? I worry about data consistency if I let it sit too long.

In general one shouldn't run with schema disagreement persistently. I've seen schema desynch issues on 1.2.x; in general, restarting some unclear subset of the affected daemons made them synch.

=Rob
Question on 'Average tombstones per slice' when running cfstats
Hello,

I cannot find documentation on the last two parameters given by cfstats below. It looks like most examples show 0 for these metrics, but our table has large numbers. What do these mean?

Read Count: 60601817
Read Latency: 6.107321156707232 ms.
Write Count: 185864222
Write Latency: 0.04878640954362911 ms.
Pending Tasks: 0
        Table: product_v1
        SSTable count: 18
        Space used (live), bytes: 271185720329
        Space used (total), bytes: 271202831486
        SSTable Compression Ratio: 0.16228696654682392
        Number of keys (estimate): 17304960
        Memtable cell count: 378
        Memtable data size, bytes: 47468777
        Memtable switch count: 152949
        Local read count: 60601868
        Local read latency: 10.945 ms
        Local write count: 185864292
        Local write latency: 0.089 ms
        Pending tasks: 0
        Bloom filter false positives: 6430
        Bloom filter false ratio: 0.00817
        Bloom filter space used, bytes: 50549840
        Compacted partition minimum bytes: 25
        Compacted partition maximum bytes: 30130992
        Compacted partition mean bytes: 107415
        Average live cells per slice (last five minutes): 2.0
        Average tombstones per slice (last five minutes): 60.0

Venky Kandaswamy
@WalmartLabs
Re: Question on 'Average tombstones per slice' when running cfstats
On Mon, Jul 6, 2015 at 4:19 PM, Venkatesh Kandaswamy ve...@walmartlabs.com wrote:

I cannot find documentation on the last two parameters given by cfstats below. It looks like most examples show 0 for these metrics, but our table has large numbers. What do these mean?

Average live cells per slice (last five minutes): 2.0
Average tombstones per slice (last five minutes): 60.0

They mean what they say they mean? For each slice (a weird legacy thriftism you can read as "read request" in this context) there was an average of 2 un-masked (live) cells read and 60 cells which were masked (tombstones). This is a pretty bad ratio, and it suggests that your design does UPDATE-like operations which are not being merged fast enough by compaction to avoid reading past them.

=Rob
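[Editor's note: for illustration, a hedged sketch of the kinds of writes that generate the tombstones being counted. Only the table name comes from the thread; the column and key names are hypothetical:]

```
-- Overwriting a column with null writes a cell tombstone.
UPDATE product_v1 SET description = null WHERE id = 'p1';

-- An explicit delete writes tombstones as well.
DELETE description FROM product_v1 WHERE id = 'p2';

-- Cells written with a TTL become tombstones when they expire.
INSERT INTO product_v1 (id, description) VALUES ('p3', 'x') USING TTL 3600;
```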
Re: Compaction issues, 2.0.12
I’ve seen the same thing: https://issues.apache.org/jira/browse/CASSANDRA-9577

I’ve had cases where a restart clears the old tables, and I’ve had cases where a restart considers the old tables to be live.

On Jul 6, 2015, at 1:51 PM, Robert Coli rc...@eventbrite.com wrote:

On Mon, Jul 6, 2015 at 1:46 PM, Jeff Williams je...@wherethebitsroam.com wrote:

1) Cassandra version is 2.0.12.

2) Interesting. Looking at JMX org.apache.cassandra.db - ColumnFamilies - trackcontent - track_content - Attributes, I get:

LiveDiskSpaceUsed: 17788740448, i.e. ~17GB
LiveSSTableCount: 3
TotalDiskSpaceUsed: 55714084629, i.e. ~55GB

So it obviously knows about the extra disk space even though the live space looks correct. I couldn't find anything to identify the actual files though.

That's what I would expect.

3) So that was even more interesting. After restarting the cassandra daemon, the sstables were not deleted and now the same JMX attributes are:

LiveDiskSpaceUsed: 55681040579, i.e. ~55GB
LiveSSTableCount: 8
TotalDiskSpaceUsed: 55681040579, i.e. ~55GB

So some of my non-live tables are back live again, and obviously some of the big ones!!

This is permanently fatal to consistency; sorry if I was not clear enough that if they were not live, there was some risk of Cassandra considering them live again upon restart. If I were you, I would either stop the node and remove the files you know shouldn't be live, or do a major compaction ASAP. The behavior you just encountered sounds like a bug, and it is a rather serious one. SSTables which should be dead being marked live is very bad for consistency. Do you see any exceptions in your logs or anything? If you can repro, you should file a JIRA ticket with the Apache project...

=Rob
Re: Wrong peers
There is a bug in Jira related to this; it is not a driver issue, it is a Cassandra issue. It is solved in 2.0.14, I think. I will post the ticket once I find it.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Jul 6, 2015 at 10:50 PM, Jeff Williams je...@wherethebitsroam.com wrote:

Anton,

I have also seen this issue with decommissioned nodes remaining in the system.peers table. On the bright side, they can be safely removed from the system.peers table without issue. You will have to check every node in the cluster, since this data is local to each node.

Jeff

On 6 July 2015 at 22:45, nowarry nowa...@gmail.com wrote:

Hey guys,

I'm using the Ruby driver ( http://datastax.github.io/ruby-driver/ ) for backup scripts. I tried to discover all peers and got wrong peers that differ from nodetool status.

=
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load     Tokens  Owns  Host ID                               Rack
UN  10.40.231.53  1.18 TB  256     ?     b2d877d7-f031-4190-8569-976bb0ce034f  RACK01
UN  10.40.231.11  1.24 TB  256     ?     e15cda1c-65cc-40cb-b85c-c4bd665d02d7  RACK01

cqlsh> use system;
cqlsh:system> select peer from system.peers;

 peer
--------------
 10.40.231.31
 10.40.231.53

(2 rows)

What to do with these old peers? Can they be removed without consequences, since they are no longer in the cluster? And how do I keep the peers up to date?

--
Anton Koshevoy
Re: Question on 'Average tombstones per slice' when running cfstats
Wouldn't it suggest a delete-heavy workload, rather than update?

On Mon, Jul 6, 2015 at 5:21 PM, Robert Coli rc...@eventbrite.com wrote:

On Mon, Jul 6, 2015 at 4:19 PM, Venkatesh Kandaswamy ve...@walmartlabs.com wrote:

I cannot find documentation on the last two parameters given by cfstats below. It looks like most examples show 0 for these metrics, but our table has large numbers. What do these mean?

Average live cells per slice (last five minutes): 2.0
Average tombstones per slice (last five minutes): 60.0

They mean what they say they mean? For each slice (a weird legacy thriftism you can read as "read request" in this context) there was an average of 2 un-masked (live) cells read and 60 cells which were masked (tombstones). This is a pretty bad ratio, and it suggests that your design does UPDATE-like operations which are not being merged fast enough by compaction to avoid reading past them.

=Rob
Compaction issues, 2.0.12
Hi,

I have a keyspace which is using about 19GB on most nodes in the cluster. However, on one node the data directory for the only table in that keyspace now uses 52GB. It seems that old sstables are not removed after compaction.

Here is a view of the data directory. Data files before compaction (I didn't think to check sizes at the time):

$ ls /var/lib/cassandra/data/trackcontent/track_content/*Data.db
trackcontent-track_content-jb-30852-Data.db
trackcontent-track_content-jb-30866-Data.db
trackcontent-track_content-jb-37136-Data.db
trackcontent-track_content-jb-43371-Data.db
trackcontent-track_content-jb-44676-Data.db
trackcontent-track_content-jb-51845-Data.db
trackcontent-track_content-jb-55339-Data.db
trackcontent-track_content-jb-57353-Data.db
trackcontent-track_content-jb-57366-Data.db
trackcontent-track_content-jb-57367-Data.db
trackcontent-track_content-jb-57368-Data.db

Then when compaction was run I got the following in the logs:

INFO [CompactionExecutor:3546] 2015-07-06 12:01:04,246 CompactionTask.java (line 120) Compacting [
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-30852-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-30866-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-37136-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-43371-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-44676-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-51845-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-55339-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57353-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57366-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57371-Data.db'), ]
INFO [CompactionExecutor:3546] 2015-07-06 13:34:31,458 CompactionTask.java (line 296) Compacted 10 sstables to [/var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57372,]. 36,182,766,725 bytes to 16,916,868,412 (~46% of original) in 5,607,211ms = 2.877221MB/s. 84,600,535 total partitions merged to 40,835,912. Partition merge counts were {1:864528, 2:36318841, 3:3523852, 4:124091, 5:6051, 6:25, }

But after compaction completes, the data directory still looks like:

$ ls -l /var/lib/cassandra/data/trackcontent/track_content/*Data.db
-rw-r--r-- 1 cassandra cassandra 16642164891 Jun 29 06:55 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-30852-Data.db
-rw-r--r-- 1 cassandra cassandra 17216513377 Jun 30 08:36 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-30866-Data.db
-rw-r--r-- 1 cassandra cassandra   813683923 Jun 30 12:03 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-37136-Data.db
-rw-r--r-- 1 cassandra cassandra   855070477 Jun 30 13:15 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-43371-Data.db
-rw-r--r-- 1 cassandra cassandra   209921645 Jun 30 13:28 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-44676-Data.db
-rw-r--r-- 1 cassandra cassandra   213532360 Jul  2 03:16 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-51845-Data.db
-rw-r--r-- 1 cassandra cassandra    14763933 Jul  2 07:58 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-52921-Data.db
-rw-r--r-- 1 cassandra cassandra    16955005 Jul  2 08:09 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-53037-Data.db
-rw-r--r-- 1 cassandra cassandra    61322156 Jul  2 23:34 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-55339-Data.db
-rw-r--r-- 1 cassandra cassandra    61898096 Jul  4 01:02 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57353-Data.db
-rw-r--r-- 1 cassandra cassandra    75912668 Jul  5 21:23 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57366-Data.db
-rw-r--r-- 1 cassandra cassandra    32747132 Jul  6 11:01 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57371-Data.db
-rw-r--r-- 1 cassandra cassandra 16916868412 Jul  6 13:34 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57372-Data.db
-rw-r--r-- 1 cassandra cassandra     8392714 Jul  6 13:24 /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57373-Data.db

So it appears that the 10 sstables that were compacted to /var/lib/cassandra/data/trackcontent/track_content/trackcontent-track_content-jb-57372 are still sitting around, with a couple of them being very large! Any ideas for what I can check? Can I just delete the old sstables? (I'm guessing not.)
Re: Example Data Modelling
Hi Srinivasa,

I think you're right: in Cassandra you should favor denormalisation where in an RDBMS you would find a relationship like this. I'd suggest a cf like this:

CREATE TABLE salaries (
  EmpID varchar,
  FN varchar,
  LN varchar,
  Phone varchar,
  Address varchar,
  month int,
  basic int,
  flexible_allowance float,
  PRIMARY KEY (EmpID, month)
);

That way the salaries will be partitioned by EmpID and clustered by month, which I guess is the natural sorting you want.

Hope it helps. Cheers!

Carlos Alonso | Software Engineer | @calonso https://twitter.com/calonso

On 6 July 2015 at 13:01, Srinivasa T N seen...@gmail.com wrote:

Hi,

I have a basic doubt. I have an RDBMS with the following two tables:

Emp - EmpID, FN, LN, Phone, Address
Sal - Month, EmpID, Basic, Flexible Allowance

My use case is to print the salary slip at the end of each month, and the slip contains the emp name and his other details. Now, if I want to have the same in Cassandra, I will have a single cf with emp personal details and his salary details. Is this the right approach? Should we have the employee personal details duplicated each month?

Regards,
Seenu.
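[Editor's note: a usage sketch against Carlos's salaries table; all values are hypothetical. Each month's row carries the duplicated employee details, so printing the slip is a single-partition read:]

```
-- Write one month's slip; employee details are duplicated per month on purpose.
INSERT INTO salaries (EmpID, FN, LN, Phone, Address, month, basic, flexible_allowance)
VALUES ('E042', 'Srinivasa', 'N', '555-0101', 'Bangalore', 201507, 50000, 7500.0);

-- Print the slip: partition key EmpID plus clustering key month.
SELECT FN, LN, Phone, Address, basic, flexible_allowance
FROM salaries
WHERE EmpID = 'E042' AND month = 201507;
```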
Re: Bootstrap code
For 2.1:
https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/dht/BootStrapper.java#L61

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jul 3, 2015 at 10:56 AM, Bill Hastings bllhasti...@gmail.com wrote:

Hi All,

Can someone please point me to where the code for bootstrapping a new node exists?

--
Cheers
Bill