Re: two dimensional slicing
and compare them, but at this point I need to focus on one to get things working, so I'm trying to make a best initial guess.

I would go for RP then. BOP may look like less work to start with, but it *will* bite you later. If you use an increasing version number as a key you will get a hot spot. Get it working with RP and standard CF's, accept the extra lookups, and then see where you are performance / complexity wise. Cassandra can be pretty fast.

I still don't really understand the problem, but I think you have many lists of names, and when each list is updated you consider it a version. You then want to answer a query such as: get all the names between foo and bar that were written to between version 100 and 200. Can this query be re-written as: get all the names between foo and bar that existed at version 200 and were created on or after version 100?

Could you re-write the entire list every version update?

CF: VersionedList
row: list_name:version
col_name: name
col_value: last updated version

So you slice one row at the upper version and discard all the columns where the value is less than the lower version? (A rough Thrift sketch of this read appears at the end of this message.)

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 27/01/2012, at 5:31 AM, Bryce Allen wrote:

Thanks, comments inline:

On Mon, 23 Jan 2012 20:59:34 +1300 aaron morton aa...@thelastpickle.com wrote:

It depends a bit on the data and the query patterns.
* How many versions do you have ?

We may have 10k versions in some cases, with up to a million names total in any given version, but more often 10K. To manage this we are currently using two CFs, one for storing compacted complete lists and one for storing deltas on the compacted list. Based on usage, we will create a new compacted list and start writing deltas against that. We should be able to limit the number of deltas in a single row to below 100; I'd like to be able to keep it lower but I'm not sure we can maintain that under all load scenarios. The compacted lists are straightforward, but there are many ways to structure the deltas and they all have trade-offs. A CF with composite columns that supported two dimensional slicing would be perfect.

* How many names in each version ?

We plan on limiting to a total of 1 million names, and around 10,000 per version (by limiting the batch size), but many deltas will have 10 names.

* When querying do you know the version numbers you want to query from ? How many are there normally?

Currently we don't know the version numbers in advance - they are timestamps, and we are querying for versions less than or equal to the desired timestamp. We have talked about using vector clock versions and maintaining an index mapping time to version numbers, in which case we would know the exact versions after the index lookup, at the expense of another RTT on every operation.

* How frequent are the updates and the reads ?

We expect reads to be more frequent than writes. Unfortunately we don't have solid numbers on what to expect, but I would guess 20x. Update operations will involve several reads to determine where to write.

I would lean towards using two standard CF's, one to list all the version numbers (in a single row probably) and one to hold the names in a particular version. To do your query, slice the first CF and then run multi gets to the second. That's probably not the best solution; if you can add some more info it may get better.

I'm actually leaning back toward BOP, as I run into more issues and complexity with the RP models.
I'd really like to implement both and compare them, but at this point I need to focus on one to get things working, so I'm trying to make a best initial guess.

On 21/01/2012, at 6:20 AM, Bryce Allen wrote:

I'm storing very large versioned lists of names, and I'd like to query a range of names within a given range of versions, which is a two dimensional slice, in a single query. This is easy to do using ByteOrderedPartitioner, but seems to require multiple (non parallel) queries and extra CFs when using RandomPartitioner.

I see two approaches when using RP:

1) Data is stored in a super column family, with one dimension being the super column names and the other the sub column names. Since slicing on sub columns requires a list of super column names, a second standard CF is needed to get a range of names before doing a query on the main super CF. With CASSANDRA-2710, the same is possible using a standard CF with composite types instead of a super CF.

2) If one of the dimensions is small, a two dimensional slice isn't required. The data can be stored in a standard CF with linear ordering on a composite type (large_dimension, small_dimension). Data is queried based on the large dimension, and the client throws out the extra data in the other dimension.

Neither of the above solutions is ideal. Does anyone else have
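A minimal sketch of the slice-and-discard read against the VersionedList CF Aaron describes, using the raw Thrift API. The keyspace name, the list_name:version row key encoding, and storing the version as an 8-byte long column value are all assumptions made for illustration, not anything from the thread:

    import java.util.List;
    import org.apache.cassandra.thrift.*;
    import org.apache.cassandra.utils.ByteBufferUtil;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    public class VersionedListRead {
        public static void main(String[] args) throws Exception {
            TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();
            client.set_keyspace("MyKeyspace"); // hypothetical keyspace

            // One row per (list, version). Slice the row at the upper version;
            // note "bar" sorts before "foo", so it is the slice start.
            SlicePredicate predicate = new SlicePredicate();
            predicate.setSlice_range(new SliceRange(
                    ByteBufferUtil.bytes("bar"), ByteBufferUtil.bytes("foo"), false, 10000));
            List<ColumnOrSuperColumn> slice = client.get_slice(
                    ByteBufferUtil.bytes("mylist:200"),   // row key: list_name:version
                    new ColumnParent("VersionedList"),
                    predicate, ConsistencyLevel.QUORUM);

            // Then discard columns whose value (the version that last wrote
            // the name) is below the lower bound of the version range.
            for (ColumnOrSuperColumn cosc : slice) {
                long writtenAt = ByteBufferUtil.toLong(cosc.column.value);
                if (writtenAt >= 100)
                    System.out.println(ByteBufferUtil.string(cosc.column.name) + " @ " + writtenAt);
            }
            transport.close();
        }
    }

The upside of this layout is that the two-dimensional query collapses into a single one-row slice plus a cheap client-side filter, at the cost of rewriting the list on every version update.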
Re: Thrift vs. CQL
Someone who knows more than me said the opposite, and I did not want to appear to know more than I do. To be clear, I don't know of any plans to kill off Thrift. It was a late-night opinion.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 28/01/2012, at 10:19 PM, bxqdev wrote:

On 1/28/2012 12:41 PM, aaron morton wrote:

Please disregard my "(both are supported now and that situation may not last for ever.)" comment.

ok, but why did you change your mind?

Aaron
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 28/01/2012, at 9:34 PM, aaron morton wrote:

Use a higher level client for your language http://wiki.apache.org/cassandra/ClientOptions and avoid the question. The difference is mostly of concern to client writers. (Both are supported now and that situation may not last for ever.)

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 28/01/2012, at 1:26 AM, bxqdev wrote:

Hello!

Datastax's Cassandra documentation says that the CQL API is the future of the Cassandra API. It also says that eventually the Thrift API will be removed completely. Is it true? Do you have any plans of removing the Thrift API, leaving the CQL API only?

thanks.
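To make the "use a higher level client" advice concrete, here is a minimal read using Hector, one of the Java clients listed on the ClientOptions page. The cluster address, keyspace, CF, row key, and column name are all made up for the example; the point is that the application codes against the client's API and never touches Thrift (or CQL) directly:

    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.HColumn;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.ColumnQuery;
    import me.prettyprint.hector.api.query.QueryResult;

    public class HectorRead {
        public static void main(String[] args) {
            // Hector speaks Thrift to the cluster under the hood.
            Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");
            Keyspace keyspace = HFactory.createKeyspace("MyKeyspace", cluster); // hypothetical

            // Read one column of one row from a hypothetical "Users" CF.
            ColumnQuery<String, String, String> query = HFactory.createStringColumnQuery(keyspace);
            query.setColumnFamily("Users").setKey("jsmith").setName("email");
            QueryResult<HColumn<String, String>> result = query.execute();

            HColumn<String, String> column = result.get();
            System.out.println(column == null ? "not found" : column.getValue());
        }
    }

If the transport underneath ever changes, code written at this level should keep working with, at most, a client library upgrade.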
Re: Restart cassandra every X days?
Yes, but… for every upgrade read the NEWS.txt; it goes through the upgrade procedure in detail. If you want to feel extra smart, scan through the CHANGES.txt to get an idea of what's going on.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 29/01/2012, at 4:14 AM, Maxim Potekhin wrote:

Sorry if this has been covered, I was concentrating solely on 0.8x -- can I just d/l 1.0.x and continue using the same data on the same cluster?

Maxim

On 1/28/2012 7:53 AM, R. Verlangen wrote:

Ok, seems that it's clear what I should do next ;-)

2012/1/28 aaron morton aa...@thelastpickle.com

There are no blockers to upgrading to 1.0.X.

A
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 28/01/2012, at 7:48 AM, R. Verlangen wrote:

Ok. Seems that an upgrade might fix these problems. Is Cassandra 1.x.x stable enough to upgrade to, or should we wait for a couple of weeks?

2012/1/27 Edward Capriolo edlinuxg...@gmail.com

I would not say that issuing a restart after x days is a good idea. You are mostly developing a superstition. You should find the source of the problem. It could be jmx or thrift clients not closing connections. We don't restart nodes on a regimen; they work fine.

On Thursday, January 26, 2012, Mike Panchenko m...@mihasya.com wrote:

There are two relevant bugs (that I know of), both resolved in somewhat recent versions, which make somewhat regular restarts beneficial:

https://issues.apache.org/jira/browse/CASSANDRA-2868 (memory leak in GCInspector, fixed in 0.7.9/0.8.5)
https://issues.apache.org/jira/browse/CASSANDRA-2252 (heap fragmentation due to the way memtables used to be allocated, refactored in 1.0.0)

Restarting daily is probably too frequent for either one of those problems. We usually notice degraded performance in our ancient cluster after ~2 weeks w/o a restart.

As Aaron mentioned, if you have plenty of disk space, there's no reason to worry about cruft sstables. The size of your active set is what matters, and you can determine if that's getting too big by watching for iowait (due to reads from the data partition) and/or paging activity of the java process. When you hit that problem, the solution is to 1. try to tune your caches and 2. add more nodes to spread the load. I'll reiterate - looking at raw disk space usage should not be your guide for that.

Forcing a GC generally works, but should not be relied upon (note the word "suggests" in http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#gc()). It's great news that 1.0 uses a better mechanism for releasing unused sstables. nodetool compact triggers a major compaction and is no longer recommended by DataStax (details here: http://www.datastax.com/docs/1.0/operations/tuning#tuning-compaction, bottom of the page).

Hope this helps.

Mike.

On Wed, Jan 25, 2012 at 5:14 PM, aaron morton aa...@thelastpickle.com wrote:

That disk usage pattern is to be expected in pre-1.0 versions. Disk usage is far less interesting than disk free space: if it's using 60GB and there is 200GB, that's ok. If it's using 60GB and there is 6MB free, that's a problem.

In pre-1.0, compacted files are deleted on disk by waiting for the JVM to decide to GC all remaining references. If there is not enough space on disk (to store the total size of the files it is about to write or compact), GC is forced and the files are deleted. Otherwise they will get deleted at some point in the future. In 1.0, files are reference counted and space is freed much sooner.
With regard to regular maintenance, nodetool cleanup removes data from a node that it is no longer a replica for. This is only of use when you have done a token move.

I would not recommend a daily restart of the cassandra process. You will lose all the run time optimizations the JVM has made (I think the mapped file pages will stay resident), as well as adding additional entropy to the system which must be repaired via HH, RR or nodetool repair. If you want to see compacted files purged faster, the best approach would be to upgrade to 1.0.

Hope that helps.
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/01/2012, at 9:51 AM, R. Verlangen wrote:

In his message he explains that it's for "Forcing a GC". GC stands for garbage collection. For some more background see: http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)

Cheers!

2012/1/25 mike...@thomsonreuters.com

Karl, Can you give a little more details on these 2 lines, what do they do?

java -jar cmdline-jmxclient-0.10.3.jar - localhost:8080 java.lang:type=Memory gc

Thank you, Mike

-----Original Message-----
From: Karl Hiramoto [mailto:k...@hiramoto.org]
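For reference, that jmxclient one-liner just invokes the gc operation on the JVM's built-in java.lang:type=Memory MBean over JMX. A minimal sketch of the same call from plain Java, assuming JMX is listening on localhost:8080 as in the command above:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ForceGc {
        public static void main(String[] args) throws Exception {
            // Connect to the node's JMX port (8080 here, matching the command above).
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                // Invoke the no-arg gc operation on the Memory MBean; like
                // System.gc(), this is only a suggestion to the JVM.
                mbs.invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);
            } finally {
                connector.close();
            }
        }
    }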
RE: rpc_address: 0.0.0.0
If the code in the 0.8 branch is reflective of what is actually included in Cassandra 0.8.9 (here: http://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java) then the problem is that line 202 is doing an == comparison on strings. The correct way to compare would be endpoint_address.equals("0.0.0.0") instead.

- Mike

From: Patrik Modesto [patrik.mode...@gmail.com]
Sent: Thursday, January 26, 2012 5:34 AM
To: user@cassandra.apache.org
Subject: rpc_address: 0.0.0.0

Hi,

#using cassandra 0.8.9

I used to have rpc_address set to 0.0.0.0 to bind cassandra to all interfaces. After upgrading our Hadoop jobs to cassandra 0.8.9 (from 0.8.7) there are lots of these messages, and the jobs fail:

12/01/26 11:15:21 DEBUG hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
    at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
    at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
    ... 9 more
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
    ... 11 more

...

Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 10.0.18.129,10.0.18.99,10.0.18.98
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:156)
    ... 19 more
Caused by: java.io.IOException: failed connecting to all endpoints 10.0.18.129,10.0.18.99,10.0.18.98
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:241)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
    at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

describe_ring returns:

endpoints: 10.0.18.129,10.0.18.99,10.0.18.98
rpc_endpoints: 0.0.0.0,0.0.0.0,0.0.0.0

I've found CASSANDRA-3214, which says "... then fall back to the gossip endpoint if we don't get what we want", but the job just fails. The job is run from outside of the cluster, which is valid. When I run the job from one of the nodes, there is of course cassandra on 0.0.0.0 and the job runs. Where could the problem be?

Regards, P.
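A sketch of the fix Mike describes: in Java, == on Strings compares object references, so an rpc_endpoint value of "0.0.0.0" deserialized off the wire will never be the same object as the literal, and the comparison silently fails. The variable and method names below are illustrative, not copied from the Cassandra source:

    public class EndpointFallback {
        // Pick the address to connect to for a split, falling back to the
        // gossip endpoint when the node advertises the wildcard rpc_address.
        static String pickEndpoint(String rpcEndpoint, String gossipEndpoint) {
            // Broken: '==' compares references, so this branch is never taken
            // for a string read from Thrift, and no fallback happens:
            //   if (rpcEndpoint == "0.0.0.0") return gossipEndpoint;

            // Fixed: compare string contents instead.
            if ("0.0.0.0".equals(rpcEndpoint))
                return gossipEndpoint;
            return rpcEndpoint;
        }
    }

Putting the literal first ("0.0.0.0".equals(...)) also makes the check null-safe, which matters if describe_ring ever returns a missing rpc_endpoint.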
Any tools like phpMyAdmin to see data stored in Cassandra ?
I have tried Sebastien's phpMyAdmin For Cassandra (https://github.com/sebgiroux/Cassandra-Cluster-Admin) to see the data stored in Cassandra in the same manner as phpMyAdmin allows. But since it makes assumptions about the datatypes of column names/values and doesn't allow configuring, on a per-CF basis, the datatype the data should be read as, I couldn't make the best use of it. Are there any other similar tools out there that can do the job better?
RE: Any tools like phpMyAdmin to see data stored in Cassandra ?
OpsCenter? http://www.datastax.com/products/opscenter

- Mike

From: rajkumar@gmail.com [rajkumar@gmail.com] on behalf of Ertio Lew [ertio...@gmail.com]
Sent: Sunday, January 29, 2012 4:26 PM
To: user
Subject: Any tools like phpMyAdmin to see data stored in Cassandra ?

I have tried Sebastien's phpMyAdmin For Cassandra (https://github.com/sebgiroux/Cassandra-Cluster-Admin) to see the data stored in Cassandra in the same manner as phpMyAdmin allows. But since it makes assumptions about the datatypes of column names/values and doesn't allow configuring, on a per-CF basis, the datatype the data should be read as, I couldn't make the best use of it. Are there any other similar tools out there that can do the job better?
RE: SQL DB Integration
Hi Viktor,

Thanks for the comments. True, the characteristics that I outlined were general, just to give background/context to the problem I'm trying to solve. I will address more specific questions when it comes to designing and implementing the data storage solution and the API to do the integration of (1) - (3) above.

Given that our data mining application (IBM SPSS Modeler), our partner platform (an Oracle DB data model) used for additional services, and our clients' DBs are all based on SQL, from your experience:

(1) Is it a good idea to use Cassandra as a storage solution for SQL data, converted to the NoSQL data model just to be stored on Cassandra?

(2) Do you know of any similar cases of using Cassandra as storage supporting SQL data applications, or do the data model architecture differences and high development costs make this impractical?

(3) If using Cassandra as storage supporting SQL data applications is not a good idea, can you recommend an alternative SQL cloud DB solution that has good scalability?

Thanks and regards, Krassimir Kostov
Re: Any tools like phpMyAdmin to see data stored in Cassandra ?
On Mon, Jan 30, 2012 at 7:16 AM, Frisch, Michael michael.fri...@nuance.com wrote:

OpsCenter? http://www.datastax.com/products/opscenter

- Mike

I have tried Sebastien's phpMyAdmin For Cassandra (https://github.com/sebgiroux/Cassandra-Cluster-Admin) to see the data stored in Cassandra in the same manner as phpMyAdmin allows. But since it makes assumptions about the datatypes of column names/values and doesn't allow configuring, on a per-CF basis, the datatype the data should be read as, I couldn't make the best use of it. Are there any other similar tools out there that can do the job better?

Thanks, that's a great product, but unfortunately it doesn't work with Windows. Any tools for Windows?
Re: two dimensional slicing
On Sun, Jan 29, 2012 at 7:26 PM, aaron morton aa...@thelastpickle.com wrote:

and compare them, but at this point I need to focus on one to get things working, so I'm trying to make a best initial guess.

I would go for RP then. BOP may look like less work to start with, but it *will* bite you later. If you use an increasing version number as a key you will get a hot spot. Get it working with RP and standard CF's, accept the extra lookups, and then see where you are performance / complexity wise. Cassandra can be pretty fast.

Of course, there is no guarantee that it will bite you. Whatever data hotspot you may get may very well be minor vs. the advantage of slicing continuous blocks of data on a single server vs. random bits and pieces all over the place. For instance, there are many large data repositories out there of analytic data which only have a few queries per hour. BOP will most likely have no performance problems at all for many of these; indeed, it may be much faster than the alternatives.

BOP is very useful and powerful for many things, and saves a fair chunk of development time vs. the alternatives when you can use it. If we really want everybody to stop using it, we should change Cassandra so that by default it can provide the same function in some other way, without adding days and maybe weeks of development and extra complexity to your project.

Terje
Re: rpc_address: 0.0.0.0
Thanks! I've created ticket https://issues.apache.org/jira/browse/CASSANDRA-3811

Regards, P.

On Sun, Jan 29, 2012 at 20:00, Frisch, Michael michael.fri...@nuance.com wrote:

If the code in the 0.8 branch is reflective of what is actually included in Cassandra 0.8.9 (here: http://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java) then the problem is that line 202 is doing an == comparison on strings. The correct way to compare would be endpoint_address.equals("0.0.0.0") instead.

- Mike