Re: Timeuuid inserted with now(), how to get the value back in Java client?
no, there's no way. you should generate the TIMEUUID on the client side so that you have it. T# On Sat, Mar 29, 2014 at 1:01 AM, Andy Atj2 andya...@gmail.com wrote: I'm writing a Java client to a Cassandra db. One of the main primary keys is a timeuuid. I plan to do INSERTs using now() and have Cassandra generate the value of the timeuuid. After the INSERT, I need the Cassandra-generated timeuuid value. Is there an easy way to get it, without having to re-query for the record I just inserted, hoping to get only one record back? Remember, I don't have the PK. E.g., in every other db there's a way to get the generated PK back. In SQL it's @@identity, in Oracle it's...etc etc. I know Cassandra is not an RDBMS. All I want is the value Cassandra just generated. Thanks, Andy
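for reference, generating the timeuuid on the client with the DataStax Java driver looks roughly like this (a sketch I haven't run, assuming an already connected Session, and the table and column names are made up):

  import java.util.UUID;
  import com.datastax.driver.core.PreparedStatement;
  import com.datastax.driver.core.Session;
  import com.datastax.driver.core.utils.UUIDs;

  // generate the timeuuid on the client instead of using now(),
  // so you still have the primary key after the INSERT
  UUID id = UUIDs.timeBased();
  PreparedStatement insert = session.prepare("INSERT INTO events (id, payload) VALUES (?, ?)");
  session.execute(insert.bind(id, "some payload"));
  // id is the exact value that went into the row, no need to query it back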
Re: Meaning of token column in system.peers and system.local
your assumption about 256 tokens per node is correct. as for you second question, it seems to me like most of your assumptions are correct, but I'm not sure I understand them correctly. hopefully someone else can answer this better. tokens are a property of the cluster and not the keyspace. the first replica of any token will be the same for all keyspaces, but with different replication factors the other replicas will differ. when you query the system.local and system.peers tables you must make sure that you don't connect to other nodes. I think the inconsistency you think you found is because the first and second queries went to different nodes. the java driver will connect to all nodes and load balance requests by default. T# On Mon, Mar 31, 2014 at 4:06 AM, Clint Kelly clint.ke...@gmail.com wrote: BTW one other thing that I have not been able to debug today that maybe someone can help me with: I am using a three-node Cassandra cluster with Vagrant. The nodes in my cluster are 192.168.200.11, 192.168.200.12, and 192.168.200.13. If I use cqlsh to connect to 192.168.200.11, I see unique sets of tokens when I run the following three commands: select tokens from system.local select tokens from system.peers where peer=192.168.200.12 select tokens from system.peers where peer=192.168.200.13 This is what I expect. However, when I tried making an application with the Java driver that does the following: - Create a Session by connecting to 192.168.200.11 - From that session, select tokens from system.local - From that session, select tokens, peer from system.peers Now I get the exact-same set of tokens from system.local and from the row in system.peers in which peer=192.168.200.13. Anyone have any idea why this would happen? I'm not sure how to debug this. I see the following log from the Java driver: 14/03/30 19:05:24 DEBUG com.datastax.driver.core.Cluster: Starting new cluster with contact points [/192.168.200.11] 14/03/30 19:05:24 INFO com.datastax.driver.core.Cluster: New Cassandra host /192.168.200.13 added 14/03/30 19:05:24 INFO com.datastax.driver.core.Cluster: New Cassandra host /192.168.200.12 added I'm running Cassandra 2.0.6 in the virtual machine and I built my application with version 2.0.1 of the driver. Best regards, Clint On Sun, Mar 30, 2014 at 4:51 PM, Clint Kelly clint.ke...@gmail.comwrote: Hi all, I am working on a Hadoop InputFormat implementation that uses only the native protocol Java driver and not the Thrift API. I am currently trying to replicate some of the behavior of *Cassandra.client.describe_ring(myKeyspace)* from the Thrift API. I would like to do the following: - Get a list of all of the token ranges for a cluster - For every token range, determine the replica nodes on which the data in the token range resides - Estimate the number of rows for every range of tokens - Groups ranges of tokens on common replica nodes such that we can create a set of input splits for Hadoop with total estimated line counts that are reasonably close to the requested split size Last week I received some much-appreciated help on this list that pointed me to using the system.peers table to get the list of token ranges for the cluster and the corresponding hosts. Today I created a three-node C* cluster in Vagrant (https://github.com/dholbrook/vagrant-cassandra) and tried inspecting some of the system tables. I have a couple of questions now: 1. 
*How many total unique tokens should I expect to see in my cluster?* If I have three nodes, and each node has a cassandra.yaml with num_tokens = 256, then should I expect a total of 256*3 = 768 distinct vnodes? 2. *How does the creation of vnodes and their assignment to nodes relate to the replication factor for a given keyspace?* I never thought about this until today, and I tried to reread the documentation on virtual nodes, replication in Cassandra, etc., and now I am sadly still confused. Here is what I think I understand. :) - Given a row with a partition key, any client request for an operation on that row will go to a coordinator node in the cluster. - The coordinator node will compute the token value for the row and from that determine a set of replica nodes for that token. - One of the replica nodes I assume is the node that owns the vnode with the token range that encompasses the token - The identity of the owner of this virtual node is a cross-keyspace property - And the other replicas were originally chosen based on the replica-placement strategy - And therefore the other replicas will be different for each keyspace (because replication factors and replica-placement strategy are properties of a keyspace) 3. What do the values in the token column in system.peers and system.local refer to then? - Since these tables appear to be global, and
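one way to see which node actually answered each query with the Java driver is to look at the execution info on the result sets, roughly like this (an untested sketch against the 2.0 driver):

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.ResultSet;
  import com.datastax.driver.core.Session;

  Cluster cluster = Cluster.builder().addContactPoint("192.168.200.11").build();
  Session session = cluster.connect();

  ResultSet local = session.execute("SELECT tokens FROM system.local");
  ResultSet peers = session.execute("SELECT peer, tokens FROM system.peers");

  // the driver load balances across all nodes by default, so these two queries may have
  // gone to different coordinators, which would explain the seemingly inconsistent tokens
  System.out.println("system.local answered by " + local.getExecutionInfo().getQueriedHost());
  System.out.println("system.peers answered by " + peers.getExecutionInfo().getQueriedHost());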
Re: Production Quality Ruby Driver?
I'm the author of cql-rb, the first one on your list. It runs in production in systems doing tens of thousands of operations per second. cequel is an ORM and its latest version runs on top of cql-rb. If you decide on using cql-rb I'm happy to help you out with any problems you might have, just open an issue on the GitHub project page. yours Theo On Mon, Mar 17, 2014 at 6:55 PM, NORD SC jan.algermis...@nordsc.com wrote: Hi, I am looking for a Ruby driver that is production ready and truly supports CQL 3. Can anyone strongly recommend one in particular? I found - https://github.com/iconara/cql-rb - https://github.com/kreynolds/cassandra-cql - https://github.com/cequel/cequel Jan
Re: Proposal: freeze Thrift starting with 2.1.0
Speaking as a CQL driver maintainer (Ruby) I'm +1 for end-of-lining Thrift. I agree with Edward that it's unfortunate that there are no official drivers being maintained by the Cassandra maintainers -- even though the current state with the Datastax drivers is in practice very close (it is not the same thing though). However, I don't agree that not having drivers in the same repo/project is a problem. Whether or not there's a Java driver in the Cassandra source doesn't matter at all to us non-Java developers, and I don't see any difference between the situation where there's no driver in the source or just a Java driver. I might have misunderstood Edward's point about this, though. The CQL protocol is the key, as others have mentioned. As long as that is maintained and respected, I think it's absolutely fine not having any drivers shipped as part of Cassandra. However, I feel as though this has not been the case lately. I'm thinking particularly about the UDT feature of 2.1, which is not a part of the CQL spec. There is no documentation on how drivers should handle them and what a user should be able to expect from a driver; they're completely implemented as custom types. I hope this will be fixed before 2.1 is released (and there have been good discussions on the mailing lists about how a driver should handle UDTs), but it shows a problem with the the-spec-is-the-truth argument. I think we'll be fine as long as the spec is the truth, but that requires the spec to be the truth and new features to not be bolted on outside of the spec. T# On Wed, Mar 12, 2014 at 3:23 PM, Peter Lin wool...@gmail.com wrote: I'm enjoying the discussion also. @Brian I've been looking at spark/shark along with other recent developments the last few years. Berkeley has been doing some interesting stuff. One reason I like Thrift is for type safety and the benefits for query validation and query optimization. One could do similar things with CQL, but it's just more work, especially with dynamic columns. I know others are mixing static with dynamic columns, so I'm not alone. I have no clue how long it will take to get there, but having tools like query explanation is a big time saver. Writing business reports is hard enough, so every bit of help the tool can provide makes it less painful. On Wed, Mar 12, 2014 at 10:12 AM, Brian O'Neill b...@alumni.brown.edu wrote: just when you thought the thread died... First, let me say we are *WAY* off topic. But that is a good thing. I love this community because there are a ton of passionate, smart people. (often with differing perspectives ;) RE: Reporting against C* (@Peter Lin) We've had the same experience. Pig + Hadoop is painful. We are experimenting with Spark/Shark, operating directly against the data. http://brianoneill.blogspot.com/2014/03/spark-on-cassandra-w-calliope.html The Shark layer gives you SQL and caching capabilities that make it easy to use and fast (for smaller data sets). In front of this, we are going to add dimensional aggregations so we can operate at larger scales. (then the Hive reports will run against the aggregations) RE: REST Server (@Russel Bradbury) We had moderate success with Virgil, which was a REST server built directly on Thrift. We built it directly on top of Thrift, so one day it could be easily embedded in the C* server itself. It could be deployed separately, or run an embedded C*. More often than not, we ended up running it separately to separate the layers.
(just like Titan and Rexster) I've started on a rewrite of Virgil called Memnon that rides on top of CQL. (I'd love some help) https://github.com/boneill42/memnon RE: CQL vs. Thrift We've hitched our wagons to CQL. CQL != Relational. We've had success translating our native schemas into CQL, including all the NoSQL goodness of wide-rows, etc. You just need a good understanding of how things translate into storage and underlying CFs. If anything, I think we could add some DESCRIBE information, which would help users with this, along the lines of: (https://issues.apache.org/jira/browse/CASSANDRA-6676) CQL does open up the *opportunity* for users to articulate more complex queries using more familiar syntax. (including future things such as joins, grouping, etc.) To me, that is exciting, and again -- one of the reasons we are leaning on it. my two cents, brian --- Brian O'Neill Chief Technology Officer *Health Market Science* *The Science of Better Results* 2700 Horizon Drive * King of Prussia, PA * 19406 M: 215.588.6024 * @boneill42 http://www.twitter.com/boneill42 * healthmarketscience.com
Re: How to paginate through all columns in a row?
You can page yourself using the withColumnRange method (see the slice query example on the page you linked to). What you do is that you save the last column you got from the previous query, and you set that as the start of the range you pass to withColumnRange. You don't need to set an end of the range, but you want to set a max size. This code is just a quick rewrite from the page you linked to and I haven't checked that it works, but it should give you an idea of where to start:

  ColumnList<String> result;
  int pageSize = 100;
  String offset = Character.toString('\0');
  do {
    result = keyspace.prepareQuery(CF_STANDARD1)
      .getKey(rowKey)
      .withColumnRange(new RangeBuilder().setStart(offset).setMaxSize(pageSize).build())
      .execute().getResult();
    for (Column<String> col : result) {
      // do something with your column here, then save the last column name to use
      // as the offset when loading the next page (the start of a range is inclusive,
      // so the first column of the next page will repeat the last one of this page)
      offset = col.getName();
    }
  } while (result.size() == pageSize);

I'm using a string with a null byte as the first offset because that should sort before all strings, but there might be a better way of doing it. If you have non-string columns or composite columns the exact way to do this is a bit different but I hope this shows you the general idea. T# On Thu, Feb 27, 2014 at 11:36 AM, Lu, Boying boying...@emc.com wrote: Hi, All, I'm using Netflix/Astyanax as a java cassandra client to access Cassandra DB. I need to paginate through all columns in a row and I found the document at https://github.com/Netflix/astyanax/wiki/Reading-Data about how to do that. But my requirement is a little different. I don't want to do pagination in 'one querying session', i.e. I don't want to hold the returned 'RowQuery' object to get the next page. Is there any way that I can keep a 'marker' for the next page, so by using the marker I can tell the Cassandra DB where to start the query? e.g. the query result has three 'pages', can I build the query by giving a marker pointing to 'page 2' and Cassandra will return the second page of the query? Thanks a lot. Boying
Re: How should clients handle the user defined types in 2.1?
thanks for the high level description of the format, I'll see if I can take a stab at implementing support for custom types now. and maybe I should take all of the reverse engineering I've done of the type encoding and decoding and send a pull request for the protocol spec, or write an appendix. T# On Tue, Feb 25, 2014 at 12:10 PM, Sylvain Lebresne sylv...@datastax.com wrote: Is there any documentation on how CQL clients should handle the new user defined types coming in 2.1? There's nothing in the protocol specification on how to handle custom types as far as I can see. Can't say there is much documentation so far for that. As for the spec, it was written in a time where user defined types didn't exist and so as far as the protocol is concerned so far, user defined types are handled by the protocol as a custom type, i.e. the full internal class is returned. And so ... For example, I tried creating the address type from the description of CASSANDRA-5590, and this is how its metadata looks (the metadata for a query contains a column with a custom type and this is the description of it): org.apache.cassandra.db.marshal.UserType(user_defined_types,61646472657373,737472656574:org.apache.cassandra.db.marshal.UTF8Type,63697479:org.apache.cassandra.db.marshal.UTF8Type,7a69705f636f6465:org.apache.cassandra.db.marshal.Int32Type,70686f6e6573:org.apache.cassandra.db.marshal.SetType(org.apache.cassandra.db.marshal.UTF8Type)) Is the client supposed to parse that description, and in that case how? ... yes, for now you're supposed to parse that description. Which is not really much documented outside of looking up the Cassandra code, but I can tell you that the first parameter of the UserType is the keyspace name the type has been defined in, the second is the type name hex encoded, and the rest is the list of fields and their types. Each field name is hex encoded and separated from its type by ':'. And that's about it. We will introduce much shorter definitions in the next iteration of the native protocol, but it's yet unclear when that will happen. -- Sylvain
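to make the hex encoded names concrete, the decoding is just bytes-from-hex interpreted as UTF-8, e.g. 61646472657373 is "address" and 737472656574 is "street". a quick Java sketch of that part (the field list itself needs a real parser since parameterized types can nest, so treat this as illustration only):

  // decode a hex encoded name from the UserType description, e.g. "61646472657373" -> "address"
  static String hexToString(String hex) {
    byte[] bytes = new byte[hex.length() / 2];
    for (int i = 0; i < bytes.length; i++) {
      bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
    }
    return new String(bytes, java.nio.charset.StandardCharsets.UTF_8);
  }

  // hexToString("7a69705f636f6465") => "zip_code"
  // a field entry like "7a69705f636f6465:org.apache.cassandra.db.marshal.Int32Type"
  // splits on the first ':' into the hex encoded field name and its type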
Re: CQL decimal encoding
I don't know if it's by design or if it's by oversight that the data types aren't part of the binary protocol specification. I had to reverse engineer how to encode and decode all of them for the Ruby driver. There were definitely a few bugs in the first few versions that could have been avoided if there was a specification available. T# On Mon, Feb 24, 2014 at 8:43 PM, Paul LeoNerd Evans leon...@leonerd.org.uk wrote: On Mon, 24 Feb 2014 19:14:48 + Ben Hood 0x6e6...@gmail.com wrote: So I have a question about the encoding of 0: \x00\x00\x00\x00\x00. The first four octets are the decimal shift (0), and the remaining ones (one in this case) encode a varint - 0 in this case. So it's 0 * 10**0 literally zero. Technically the decimal shift matters not for zero - any four bytes could be given as the shift, ending in \x00, but 0 is the simplest. -- Paul LeoNerd Evans leon...@leonerd.org.uk ICQ# 4135350 | Registered Linux# 179460 http://www.leonerd.org.uk/
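for anyone else implementing this, my understanding is that the decoding boils down to something like the following (a Java sketch, not lifted from any particular driver):

  import java.math.BigDecimal;
  import java.math.BigInteger;
  import java.nio.ByteBuffer;

  // a CQL decimal is a four byte scale followed by the unscaled value as a varint,
  // and the value is unscaled * 10^(-scale)
  static BigDecimal decodeDecimal(ByteBuffer bytes) {
    int scale = bytes.getInt();
    byte[] unscaled = new byte[bytes.remaining()];
    bytes.get(unscaled);
    return new BigDecimal(new BigInteger(unscaled), scale);
  }

  // decodeDecimal(ByteBuffer.wrap(new byte[] {0, 0, 0, 0, 0})) => 0, Paul's example above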
How should clients handle the user defined types in 2.1?
(I posted this on the client-dev list the other day, but that list seems dead so I'm cross posting, sorry if it's the wrong thing to do) Hi, Is there any documentation on how CQL clients should handle the new user defined types coming in 2.1? There's nothing in the protocol specification on how to handle custom types as far as I can see. For example, I tried creating the address type from the description of CASSANDRA-5590, and this is how its metadata looks (the metadata for a query contains a column with a custom type and this is the description of it): org.apache.cassandra.db.marshal.UserType(user_defined_types,61646472657373,737472656574:org.apache.cassandra.db.marshal.UTF8Type,63697479:org.apache.cassandra.db.marshal.UTF8Type,7a69705f636f6465:org.apache.cassandra.db.marshal.Int32Type,70686f6e6573:org.apache.cassandra.db.marshal.SetType(org.apache.cassandra.db.marshal.UTF8Type)) Is the client supposed to parse that description, and in that case how? I could probably figure it out but it would be great if someone could point me to the right docs. yours, Theo (author of cql-rb, the Ruby driver)
Re: How should clients handle the user defined types in 2.1?
There hasn't been any activity (apart from my question) since December, and only sporadic activity before that, so I think it's essentially dead. http://www.mail-archive.com/client-dev@cassandra.apache.org/ T# On Mon, Feb 24, 2014 at 10:34 PM, Ben Hood 0x6e6...@gmail.com wrote: On Mon, Feb 24, 2014 at 7:52 PM, Theo Hultberg t...@iconara.net wrote: (I posted this on the client-dev list the other day, but that list seems dead so I'm cross posting, sorry if it's the wrong thing to do) I didn't even realize there was a list for driver implementors - is this used at all? Is it worth being on this list?
Re: manually removing sstable
thanks aaron, the second point I had not considered, and it could explain why the sstables don't always disappear completely, sometimes a small file (but megabytes instead of gigabytes) is left behind. T# On Fri, Jul 12, 2013 at 10:25 AM, aaron morton aa...@thelastpickle.com wrote: That sounds sane to me. Couple of caveats: * Remember that Expiring Columns turn into Tombstones and can only be purged after TTL and gc_grace. * Tombstones will only be purged if all fragments of a row are in the SStable(s) being compacted. Cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 11/07/2013, at 10:17 PM, Theo Hultberg t...@iconara.net wrote: a colleague of mine came up with an alternative solution that also seems to work, and I'd just like your opinion on if it's sound. we run find to list all old sstables, and then use cmdline-jmxclient to run the forceUserDefinedCompaction function on each of them. this is roughly what we do (but with find and xargs to orchestrate it):

  java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction=the_keyspace,db_file_name

the downside is that c* needs to read the file and do disk io, but the upside is that it doesn't require a restart. c* does a little more work, but we can schedule that during off-peak hours. another upside is that it feels like we're pretty safe from screwups, we won't accidentally remove an sstable with live data, the worst case is that we ask c* to compact an sstable with live data and end up with an identical sstable. if anyone else wants to do the same thing, this is the full cron command:

  0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type f -name '*-Data.db' -mtime +8 -printf forceUserDefinedCompaction=the_keyspace_name,\%P\n | xargs -t --no-run-if-empty java -jar /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager

just change the keyspace name and the path to the data directory. T# On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg t...@iconara.net wrote: thanks a lot. I can confirm that it solved our problem too. looks like the C* 2.0 feature is perfect for us. T# On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson krum...@gmail.com wrote: yep that works, you need to remove all components of the sstable though, not just -Data.db and, in 2.0 there is this: https://issues.apache.org/jira/browse/CASSANDRA-5228 /Marcus On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg t...@iconara.net wrote: Hi, I think I remember reading that if you have sstables that you know contain only data whose ttl has expired, it's safe to remove them manually by stopping c*, removing the *-Data.db files and then starting up c* again. is this correct? we have a cluster where everything is written with a ttl, and sometimes c* needs to compact over 100 gb of sstables where we know everything has expired, and we'd rather just manually get rid of those. T#
Re: Extract meta-data using cql 3
there's a keyspace called system which has a few tables that contain the metadata. for example schema_keyspaces that contain keyspace metadata, and schema_columnfamilies that contain table metadata. there are more, just fire up cqlsh and do a describe keyspace in the system keyspace to find them. T# On Fri, Jul 12, 2013 at 10:52 AM, Murali muralidharan@gmail.com wrote: Hi experts, How to extract meta-data of a table or a keyspace using CQL 3.0? -- Thanks, Murali
manually removing sstable
Hi, I think I remember reading that if you have sstables that you know contain only data whose ttl has expired, it's safe to remove them manually by stopping c*, removing the *-Data.db files and then starting up c* again. is this correct? we have a cluster where everything is written with a ttl, and sometimes c* needs to compact over 100 gb of sstables where we know everything has expired, and we'd rather just manually get rid of those. T#
Re: manually removing sstable
thanks a lot. I can confirm that it solved our problem too. looks like the C* 2.0 feature is perfect for us. T# On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson krum...@gmail.com wrote: yep that works, you need to remove all components of the sstable though, not just -Data.db and, in 2.0 there is this: https://issues.apache.org/jira/browse/CASSANDRA-5228 /Marcus On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg t...@iconara.net wrote: Hi, I think I remember reading that if you have sstables that you know contain only data whose ttl has expired, it's safe to remove them manually by stopping c*, removing the *-Data.db files and then starting up c* again. is this correct? we have a cluster where everything is written with a ttl, and sometimes c* needs to compact over 100 gb of sstables where we know everything has expired, and we'd rather just manually get rid of those. T#
Re: does anyone store large values in cassandra e.g. 100kb?
We store objects that are a couple of tens of K, sometimes 100K, and we store quite a few of these per row, sometimes hundreds of thousands. One problem we encountered early was that these rows would become so big that C* couldn't compact the rows in-memory and had to revert to slow two-pass compactions where it spills partially compacted rows to disk. we solved that in two ways, first by increasing in_memory_compaction_limit_in_mb from 64 to 128, and although it helped a little bit we quickly realized didn't have much effect because most of the time was taken up by really huge rows many times larger than that. We ended up implementing a simple sharding scheme where each row is actually 36 rows that each contain 1/36 of the range (we take the first letter in the column key and stick that on the row key on writes, and on reads we read all 36 rows -- 36 because there are 36 letters and numbers in the ascii alphabet and our column keys happen to distribute over that quite nicely). Cassandra works well with semi-large objects, and it works well with wide rows, but you have to be careful about the combination where rows get larger than 64 Mb. T# On Mon, Jul 8, 2013 at 8:13 PM, S Ahmed sahmed1...@gmail.com wrote: Hi Peter, Can you describe your environment, # of documents and what kind of usage pattern you have? On Mon, Jul 8, 2013 at 2:06 PM, Peter Lin wool...@gmail.com wrote: I regularly store word and pdf docs in cassandra without any issues. On Mon, Jul 8, 2013 at 1:46 PM, S Ahmed sahmed1...@gmail.com wrote: I'm guessing that most people use cassandra to store relatively smaller payloads like 1-5kb in size. Is there anyone using it to store say 100kb (1/10 of a megabyte) and if so, was there any tweaking or gotchas that you ran into?
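the write side of that sharding scheme is essentially just this (a simplified Java sketch, the separator and exact key format here are made up):

  import java.util.ArrayList;
  import java.util.List;

  // on writes: stick the first character of the column key onto the row key,
  // so one logical row becomes 36 physical rows
  static String shardKey(String rowKey, String columnKey) {
    return rowKey + ":" + Character.toUpperCase(columnKey.charAt(0));
  }

  // on reads: query all 36 shards and merge the results
  static List<String> allShardKeys(String rowKey) {
    List<String> keys = new ArrayList<String>();
    for (char c : "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray()) {
      keys.add(rowKey + ":" + c);
    }
    return keys;
  }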
Re: does anyone store large values in cassandra e.g. 100kb?
yes, by splitting the rows into 36 parts it's very rare that any part gets big enough to impact the clusters performance. there are still rows that are bigger than the in memory compaction limit, but when it's only some it doesn't matter as much. T# On Tue, Jul 9, 2013 at 5:43 PM, S Ahmed sahmed1...@gmail.com wrote: So was the point of breaking into 36 parts to bring each row to the 64 or 128mb threshold? On Tue, Jul 9, 2013 at 3:18 AM, Theo Hultberg t...@iconara.net wrote: We store objects that are a couple of tens of K, sometimes 100K, and we store quite a few of these per row, sometimes hundreds of thousands. One problem we encountered early was that these rows would become so big that C* couldn't compact the rows in-memory and had to revert to slow two-pass compactions where it spills partially compacted rows to disk. we solved that in two ways, first by increasing in_memory_compaction_limit_in_mb from 64 to 128, and although it helped a little bit we quickly realized didn't have much effect because most of the time was taken up by really huge rows many times larger than that. We ended up implementing a simple sharding scheme where each row is actually 36 rows that each contain 1/36 of the range (we take the first letter in the column key and stick that on the row key on writes, and on reads we read all 36 rows -- 36 because there are 36 letters and numbers in the ascii alphabet and our column keys happen to distribute over that quite nicely). Cassandra works well with semi-large objects, and it works well with wide rows, but you have to be careful about the combination where rows get larger than 64 Mb. T# On Mon, Jul 8, 2013 at 8:13 PM, S Ahmed sahmed1...@gmail.com wrote: Hi Peter, Can you describe your environment, # of documents and what kind of usage pattern you have? On Mon, Jul 8, 2013 at 2:06 PM, Peter Lin wool...@gmail.com wrote: I regularly store word and pdf docs in cassandra without any issues. On Mon, Jul 8, 2013 at 1:46 PM, S Ahmed sahmed1...@gmail.com wrote: I'm guessing that most people use cassandra to store relatively smaller payloads like 1-5kb in size. Is there anyone using it to store say 100kb (1/10 of a megabyte) and if so, was there any tweaking or gotchas that you ran into?
Re: What is best Cassandra client?
Datastax Java driver: https://github.com/datastax/java-driver T# On Thu, Jul 4, 2013 at 10:25 AM, Tony Anecito adanec...@yahoo.com wrote: Hi All, What is the best client to use? I want to use CQL 3.0.3 and have support for preparedStatmements. I tried JDBC and the thrift client so far. Thanks!
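it talks the native protocol and supports CQL 3 and prepared statements; basic usage looks something like this (from memory, so double check against the driver docs, and the keyspace/table names are just examples):

  import com.datastax.driver.core.BoundStatement;
  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.PreparedStatement;
  import com.datastax.driver.core.Row;
  import com.datastax.driver.core.Session;

  Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
  Session session = cluster.connect("my_keyspace");

  PreparedStatement select = session.prepare("SELECT value FROM my_table WHERE id = ?");
  BoundStatement bound = select.bind("some-id");
  for (Row row : session.execute(bound)) {
    System.out.println(row.getString("value"));
  }

  cluster.shutdown();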
Re: Performance issues with CQL3 collections?
the thing I was doing was definitely triggering the range tombstone issue, this is what I was doing:

  UPDATE clocks SET clock = ? WHERE shard = ?

in this table:

  CREATE TABLE clocks (shard INT PRIMARY KEY, clock MAP<TEXT, TIMESTAMP>)

however, from the stack overflow posts it sounds like they aren't necessarily overwriting their collections. I've tried to replicate their problem with these two statements:

  INSERT INTO clocks (shard, clock) VALUES (?, ?)
  UPDATE clocks SET clock = clock + ? WHERE shard = ?

the first one should create range tombstones because it overwrites the map on every insert, and the second should not because it adds to the map. neither of those seems to have any performance issues, at least not on inserts. and it's the slowdown on inserts that confuses me, both the stack overflow questioners say that they saw a drop in insert performance. I never saw that in my application, I just got slow reads (and Fabien's explanation makes complete sense for that). I don't understand how insert performance could be affected at all, and I know that for non-counter columns cassandra doesn't read before it writes, but is it the same for collections too? they are a bit special, but how special are they? T# On Fri, Jun 28, 2013 at 7:04 AM, aaron morton aa...@thelastpickle.com wrote: Can you provide details of the mutation statements you are running? The Stack Overflow posts don't seem to include them. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 27/06/2013, at 5:58 AM, Theo Hultberg t...@iconara.net wrote: do I understand it correctly if I think that collection modifications are done by reading the collection, writing a range tombstone that would cover the collection and then re-writing the whole collection again? or is it just the modified parts of the collection that are covered by the range tombstones, but you still get massive amounts of them and it's just their number that is the problem. would this explain the slowdown of writes too? I guess it would if cassandra needed to read the collection before it wrote the new values, otherwise I don't understand how this affects writes, but that only says how much I know about how this works. T# On Wed, Jun 26, 2013 at 10:48 AM, Fabien Rousseau fab...@yakaz.com wrote: Hi, I'm pretty sure that it's related to this ticket : https://issues.apache.org/jira/browse/CASSANDRA-5677 I'd be happy if someone tests this patch. It should apply easily on 1.2.5 and 1.2.6 After applying the patch, by default, the current implementation is still used, but modify your cassandra.yaml to add the following one : interval_tree_provider: IntervalTreeAvlProvider (Note that implementations should be interchangeable, because they share the same serializers and deserializers) Also, please note that this patch has not been reviewed nor intensively tested... So, it may not be production ready Fabien 2013/6/26 Theo Hultberg t...@iconara.net Hi, I've seen a couple of people on Stack Overflow having problems with performance when they have maps that they continuously update, and in hindsight I think I might have run into the same problem myself (but I didn't suspect it as the reason and designed differently and by accident didn't use maps anymore). Is there any reason that maps (or lists or sets) in particular would become a performance issue when they're heavily modified? As I've understood them they're not special, and shouldn't be any different performance wise than overwriting regular columns.
Is there something different going on that I'm missing? Here are the Stack Overflow questions: http://stackoverflow.com/questions/17282837/cassandra-insert-perfomance-issue-into-a-table-with-a-map-type/17290981 http://stackoverflow.com/questions/17082963/bad-performance-when-writing-log-data-to-cassandra-with-timeuuid-as-a-column-nam/17123236 yours, Theo -- Fabien Rousseau aur...@yakaz.com www.yakaz.com
Performance issues with CQL3 collections?
Hi, I've seen a couple of people on Stack Overflow having problems with performance when they have maps that they continuously update, and in hindsight I think I might have run into the same problem myself (but I didn't suspect it as the reason and designed differently and by accident didn't use maps anymore). Is there any reason that maps (or lists or sets) in particular would become a performance issue when they're heavily modified? As I've understood them they're not special, and shouldn't be any different performance wise than overwriting regular columns. Is there something different going on that I'm missing? Here are the Stack Overflow questions: http://stackoverflow.com/questions/17282837/cassandra-insert-perfomance-issue-into-a-table-with-a-map-type/17290981 http://stackoverflow.com/questions/17082963/bad-performance-when-writing-log-data-to-cassandra-with-timeuuid-as-a-column-nam/17123236 yours, Theo
Re: Performance issues with CQL3 collections?
do I understand it correctly if I think that collection modifications are done by reading the collection, writing a range tombstone that would cover the collection and then re-writing the whole collection again? or is it just the modified parts of the collection that are covered by the range tombstones, but you still get massive amounts of them and its just their number that is the problem. would this explain the slowdown of writes too? I guess it would if cassandra needed to read the collection before it wrote the new values, otherwise I don't understand how this affects writes, but that only says how much I know about how this works. T# On Wed, Jun 26, 2013 at 10:48 AM, Fabien Rousseau fab...@yakaz.com wrote: Hi, I'm pretty sure that it's related to this ticket : https://issues.apache.org/jira/browse/CASSANDRA-5677 I'd be happy if someone tests this patch. It should apply easily on 1.2.5 1.2.6 After applying the patch, by default, the current implementation is still used, but modify your cassandra.yaml to add the following one : interval_tree_provider: IntervalTreeAvlProvider (Note that implementations should be interchangeable, because they share the same serializers and deserializers) Also, please note that this patch has not been reviewed nor intensively tested... So, it may not be production ready Fabien 2013/6/26 Theo Hultberg t...@iconara.net Hi, I've seen a couple of people on Stack Overflow having problems with performance when they have maps that they continuously update, and in hindsight I think I might have run into the same problem myself (but I didn't suspect it as the reason and designed differently and by accident didn't use maps anymore). Is there any reason that maps (or lists or sets) in particular would become a performance issue when they're heavily modified? As I've understood them they're not special, and shouldn't be any different performance wise than overwriting regular columns. Is there something different going on that I'm missing? Here are the Stack Overflow questions: http://stackoverflow.com/questions/17282837/cassandra-insert-perfomance-issue-into-a-table-with-a-map-type/17290981 http://stackoverflow.com/questions/17082963/bad-performance-when-writing-log-data-to-cassandra-with-timeuuid-as-a-column-nam/17123236 yours, Theo -- Fabien Rousseau * * aur...@yakaz.comwww.yakaz.com
cql-rb, the CQL3 driver for Ruby has reached v1.0
After a few months of development and many preview releases cql-rb, the pure Ruby CQL3 driver has finally reached v1.0. You can find the code and examples on GitHub: https://github.com/iconara/cql-rb T#
Re: Why so many vnodes?
But in the paragraph just before Richard said that finding the node that owns a token becomes slower on large clusters with lots of token ranges, so increasing it further seems contradictory. Is this a correct interpretation: finding the node that owns a particular token becomes slower as the number of nodes (and therefore total token ranges) increases, but for large clusters you also need to take the time for bootstraps into account, which will become slower if each node has fewer token ranges. The speed referred to in the two cases are the speeds of different operations, and there will be a trade off, and 256 initial tokens is a trade off that works for most cases. T# On Tue, Jun 11, 2013 at 8:37 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: I think he actually meant *increase*, for this reason For small T, a random choice of initial tokens will in most cases give a poor distribution of data. The larger T is, the closer to uniform the distribution will be, with increasing probability. Alain 2013/6/11 Theo Hultberg t...@iconara.net thanks, that makes sense, but I assume in your last sentence you mean decrease it for large clusters, not increase it? T# On Mon, Jun 10, 2013 at 11:02 PM, Richard Low rich...@wentnet.comwrote: Hi Theo, The number (let's call it T and the number of nodes N) 256 was chosen to give good load balancing for random token assignments for most cluster sizes. For small T, a random choice of initial tokens will in most cases give a poor distribution of data. The larger T is, the closer to uniform the distribution will be, with increasing probability. Also, for small T, when a new node is added, it won't have many ranges to split so won't be able to take an even slice of the data. For this reason T should be large. But if it is too large, there are too many slices to keep track of as you say. The function to find which keys live where becomes more expensive and operations that deal with individual vnodes e.g. repair become slow. (An extreme example is SELECT * LIMIT 1, which when there is no data has to scan each vnode in turn in search of a single row. This is O(NT) and for even quite small T takes seconds to complete.) So 256 was chosen to be a reasonable balance. I don't think most users will find it too slow; users with extremely large clusters may need to increase it. Richard. On 10 June 2013 18:55, Theo Hultberg t...@iconara.net wrote: I'm not sure I follow what you mean, or if I've misunderstood what Cassandra is telling me. Each node has 256 vnodes (or tokens, as the prefered name seems to be). When I run `nodetool status` each node is reported as having 256 vnodes, regardless of how many nodes are in the cluster. A single node cluster has 256 vnodes on the single node, a six node cluster has 256 nodes on each machine, making 1590 vnodes in total. When I run `SELECT tokens FROM system.peers` or `nodetool ring` each node lists 256 tokens. This is different from how it works in Riak and Voldemort, if I'm not mistaken, and that is the source of my confusion. T# On Mon, Jun 10, 2013 at 4:54 PM, Milind Parikh milindpar...@gmail.comwrote: There are n vnodes regardless of the size of the physical cluster. Regards Milind On Jun 10, 2013 7:48 AM, Theo Hultberg t...@iconara.net wrote: Hi, The default number of vnodes is 256, is there any significance in this number? Since Cassandra's vnodes don't work like for example Riak's, where there is a fixed number of vnodes distributed evenly over the nodes, why so many? 
Even with a moderately sized cluster you get thousands of slices. Does this matter? If your cluster grows to over thirty machines and you start looking at ten thousand slices, would that be a problem? I guess that traversing a list of a thousand or ten thousand slices to find where a token lives isn't a huge problem, but are there any other up or downsides to having a small or large number of vnodes per node? I understand the benefits for splitting up the ring into pieces, for example to be able to stream data from more nodes when bootstrapping a new one, but that works even if each node only has say 32 vnodes (unless your cluster is truly huge). yours, Theo
Why so many vnodes?
Hi, The default number of vnodes is 256, is there any significance in this number? Since Cassandra's vnodes don't work like for example Riak's, where there is a fixed number of vnodes distributed evenly over the nodes, why so many? Even with a moderately sized cluster you get thousands of slices. Does this matter? If your cluster grows to over thirty machines and you start looking at ten thousand slices, would that be a problem? I guess that traversing a list of a thousand or ten thousand slices to find where a token lives isn't a huge problem, but are there any other up or downsides to having a small or large number of vnodes per node? I understand the benefits for splitting up the ring into pieces, for example to be able to stream data from more nodes when bootstrapping a new one, but that works even if each node only has say 32 vnodes (unless your cluster is truly huge). yours, Theo
Re: Why so many vnodes?
I'm not sure I follow what you mean, or if I've misunderstood what Cassandra is telling me. Each node has 256 vnodes (or tokens, as the preferred name seems to be). When I run `nodetool status` each node is reported as having 256 vnodes, regardless of how many nodes are in the cluster. A single node cluster has 256 vnodes on the single node, a six node cluster has 256 vnodes on each machine, making 1536 vnodes in total. When I run `SELECT tokens FROM system.peers` or `nodetool ring` each node lists 256 tokens. This is different from how it works in Riak and Voldemort, if I'm not mistaken, and that is the source of my confusion. T# On Mon, Jun 10, 2013 at 4:54 PM, Milind Parikh milindpar...@gmail.com wrote: There are n vnodes regardless of the size of the physical cluster. Regards Milind On Jun 10, 2013 7:48 AM, Theo Hultberg t...@iconara.net wrote: Hi, The default number of vnodes is 256, is there any significance in this number? Since Cassandra's vnodes don't work like for example Riak's, where there is a fixed number of vnodes distributed evenly over the nodes, why so many? Even with a moderately sized cluster you get thousands of slices. Does this matter? If your cluster grows to over thirty machines and you start looking at ten thousand slices, would that be a problem? I guess that traversing a list of a thousand or ten thousand slices to find where a token lives isn't a huge problem, but are there any other up or downsides to having a small or large number of vnodes per node? I understand the benefits for splitting up the ring into pieces, for example to be able to stream data from more nodes when bootstrapping a new one, but that works even if each node only has say 32 vnodes (unless your cluster is truly huge). yours, Theo
Re: Why so many vnodes?
thanks, that makes sense, but I assume in your last sentence you mean decrease it for large clusters, not increase it? T# On Mon, Jun 10, 2013 at 11:02 PM, Richard Low rich...@wentnet.com wrote: Hi Theo, The number (let's call it T and the number of nodes N) 256 was chosen to give good load balancing for random token assignments for most cluster sizes. For small T, a random choice of initial tokens will in most cases give a poor distribution of data. The larger T is, the closer to uniform the distribution will be, with increasing probability. Also, for small T, when a new node is added, it won't have many ranges to split so won't be able to take an even slice of the data. For this reason T should be large. But if it is too large, there are too many slices to keep track of as you say. The function to find which keys live where becomes more expensive and operations that deal with individual vnodes e.g. repair become slow. (An extreme example is SELECT * LIMIT 1, which when there is no data has to scan each vnode in turn in search of a single row. This is O(NT) and for even quite small T takes seconds to complete.) So 256 was chosen to be a reasonable balance. I don't think most users will find it too slow; users with extremely large clusters may need to increase it. Richard. On 10 June 2013 18:55, Theo Hultberg t...@iconara.net wrote: I'm not sure I follow what you mean, or if I've misunderstood what Cassandra is telling me. Each node has 256 vnodes (or tokens, as the prefered name seems to be). When I run `nodetool status` each node is reported as having 256 vnodes, regardless of how many nodes are in the cluster. A single node cluster has 256 vnodes on the single node, a six node cluster has 256 nodes on each machine, making 1590 vnodes in total. When I run `SELECT tokens FROM system.peers` or `nodetool ring` each node lists 256 tokens. This is different from how it works in Riak and Voldemort, if I'm not mistaken, and that is the source of my confusion. T# On Mon, Jun 10, 2013 at 4:54 PM, Milind Parikh milindpar...@gmail.comwrote: There are n vnodes regardless of the size of the physical cluster. Regards Milind On Jun 10, 2013 7:48 AM, Theo Hultberg t...@iconara.net wrote: Hi, The default number of vnodes is 256, is there any significance in this number? Since Cassandra's vnodes don't work like for example Riak's, where there is a fixed number of vnodes distributed evenly over the nodes, why so many? Even with a moderately sized cluster you get thousands of slices. Does this matter? If your cluster grows to over thirty machines and you start looking at ten thousand slices, would that be a problem? I guess trat traversing a list of a thousand or ten thousand slices to find where a token lives isn't a huge problem, but are there any other up or downsides to having a small or large number of vnodes per node? I understand the benefits for splitting up the ring into pieces, for example to be able to stream data from more nodes when bootstrapping a new one, but that works even if each node only has say 32 vnodes (unless your cluster is truly huge). yours, Theo
Re: [Cassandra] Conflict resolution in Cassandra
Like Edward says Cassandra's conflict resolution strategy is LWW (last write wins). This may seem simplistic, but Cassandra's Big Query-esque data model makes it less of an issue than in a pure key/value-store like Riak, for example. When all you have is an opaque value for a key you want to be able to do things like keeping conflicting writes so that you can resolve them later. Since Cassandra's rows aren't opaque, but more like a sorted map LWW is almost always enough. With Cassandra you can add new columns/cells to a row from multiple clients without having to worry about conflicts. It's only when multiple clients write to the same column/cell that there is an issue, but in that case you usually can (and you probably should) model your way around that. T# On Fri, Jun 7, 2013 at 4:51 PM, Edward Capriolo edlinuxg...@gmail.comwrote: Conflicts are managed at the column level. 1) If two columns have the same name the column with the highest timestamp wins. 2) If two columns have the same column name and the same timestamp the value of the column is compared and the highest* wins. Someone correct me if I am wrong about the *. I know the algorithm is deterministic, I do not remember if it is highest or lowest. On Thu, Jun 6, 2013 at 6:25 PM, Emalayan Vairavanathan svemala...@yahoo.com wrote: I tried google and found conflicting answers. Thats why wanted to double check with user forum. Thanks -- *From:* Bryan Talbot btal...@aeriagames.com *To:* user@cassandra.apache.org; Emalayan Vairavanathan svemala...@yahoo.com *Sent:* Thursday, 6 June 2013 3:19 PM *Subject:* Re: [Cassandra] Conflict resolution in Cassandra For generic questions like this, google is your friend: http://lmgtfy.com/?q=cassandra+conflict+resolution -Bryan On Thu, Jun 6, 2013 at 11:23 AM, Emalayan Vairavanathan svemala...@yahoo.com wrote: Hi All, Can someone tell me about the conflict resolution mechanisms provided by Cassandra? More specifically does Cassandra provides a way to define application specific conflict resolution mechanisms (per row basis / column basis)? or Does it automatically manage the conflicts based on some synchronization algorithms ? Thank you Emalayan
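to make the LWW behaviour concrete, you can supply the write timestamps yourself and see that the highest timestamp wins regardless of the order the writes arrive in (an illustrative snippet assuming a connected Java driver Session and a made up users table):

  // two writes to the same cell with explicit timestamps
  session.execute("UPDATE users USING TIMESTAMP 2000 SET email = 'second@example.com' WHERE id = 1");
  session.execute("UPDATE users USING TIMESTAMP 1000 SET email = 'first@example.com' WHERE id = 1");
  // the cell now holds 'second@example.com': its timestamp (2000) is higher,
  // even though that statement was executed first
  Row row = session.execute("SELECT email FROM users WHERE id = 1").one();
  System.out.println(row.getString("email"));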
Re: Getting error Too many in flight hints
thanks a lot for the explanation. if I understand it correctly it's basically back pressure from C*, it's telling me that it's overloaded and that I need to back off. I better start a few more nodes, I guess. T# On Thu, May 30, 2013 at 10:47 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, May 30, 2013 at 8:24 AM, Theo Hultberg t...@iconara.net wrote: I'm using Cassandra 1.2.4 on EC2 (3 x m1.large, this is a test cluster), and my application is talking to it over the binary protocol (I'm using JRuby and the cql-rb driver). I get this error quite frequently: Too many in flight hints: 2411 (the exact number varies) Has anyone any idea of what's causing it? I'm pushing the cluster quite hard with writes (but no reads at all). The code that produces this message (below) sets the bound based on the number of available processors. It is a bound on the number of in progress hints. An in progress hint (for some reason redundantly referred to as in flight) is a hint which has been submitted to the executor which will ultimately write it to local disk. If you get OverloadedException, this means that you were trying to write hints to this executor so fast that you risked OOM, so Cassandra refused to submit your hint to the hint executor and therefore (partially) failed your write.

  private static volatile int maxHintsInProgress = 1024 * FBUtilities.getAvailableProcessors();
  [... snip ...]
  for (InetAddress destination : targets)
  {
      // avoid OOMing due to excess hints. we need to do this check even for live nodes, since we can
      // still generate hints for those if it's overloaded or simply dead but not yet known-to-be-dead.
      // The idea is that if we have over maxHintsInProgress hints in flight, this is probably due to
      // a small number of nodes causing problems, so we should avoid shutting down writes completely to
      // healthy nodes. Any node with no hintsInProgress is considered healthy.
      if (totalHintsInProgress.get() > maxHintsInProgress
          && (hintsInProgress.get(destination).get() > 0 && shouldHint(destination)))
      {
          throw new OverloadedException("Too many in flight hints: " + totalHintsInProgress.get());
      }

If Cassandra didn't return this exception, it might OOM while enqueueing your hints to be stored. Giving up on trying to enqueue a hint for the failed write is chosen instead. The solution is to reduce your write rate, ideally by enough that you don't even queue hints in the first place. =Rob
Getting error Too many in flight hints
Hi, I'm using Cassandra 1.2.4 on EC2 (3 x m1.large, this is a test cluster), and my application is talking to it over the binary protocol (I'm using JRuby and the cql-rb driver). I get this error quite frequently: Too many in flight hints: 2411 (the exact number varies) Has anyone any idea of what's causing it? I'm pushing the cluster quite hard with writes (but no reads at all). T#
Re: Limit on the size of a list
In the CQL3 protocol the sizes of collections are unsigned shorts, so the maximum number of elements in a LIST<...> is 65,535. There's no check, afaik, that stops you from creating lists that are bigger than that, but the protocol doesn't handle returning them (you get the first N modulo 65536 items). On the other hand the JDBC driver talks over Thrift rather than the binary protocol, doesn't it? In that case there may be other limits. T# On Mon, May 13, 2013 at 3:26 AM, Edward Capriolo edlinuxg...@gmail.com wrote: 2 billion is the maximum theoretical limit of columns under a row. It is NOT the maximum limit of a CQL collection. The design of CQL collections currently requires retrieving the entire collection on read. On Sun, May 12, 2013 at 11:13 AM, Robert Wille rwi...@footnote.com wrote: I designed a data model for my data that uses a list of UUID's in a column. When I designed my data model, my expectation was that most of the lists would have fewer than a hundred elements, with a few having several thousand. I discovered in my data a list that has nearly 400,000 items in it. When I try to retrieve it, I get the following exception:

  java.lang.IllegalArgumentException: Illegal Capacity: -14594
    at java.util.ArrayList.<init>(ArrayList.java:110)
    at org.apache.cassandra.cql.jdbc.ListMaker.compose(ListMaker.java:54)
    at org.apache.cassandra.cql.jdbc.TypedColumn.<init>(TypedColumn.java:68)
    at org.apache.cassandra.cql.jdbc.CassandraResultSet.createColumn(CassandraResultSet.java:1086)
    at org.apache.cassandra.cql.jdbc.CassandraResultSet.populateColumns(CassandraResultSet.java:161)
    at org.apache.cassandra.cql.jdbc.CassandraResultSet.<init>(CassandraResultSet.java:134)
    at org.apache.cassandra.cql.jdbc.CassandraStatement.doExecute(CassandraStatement.java:166)
    at org.apache.cassandra.cql.jdbc.CassandraStatement.executeQuery(CassandraStatement.java:226)

I get this with Cassandra 1.2.4 and the latest snapshot of the JDBC driver. Admittedly, several hundred thousand is quite a lot of items, but odd that I'm getting some kind of wraparound, since 400,000 is a long ways from 2 billion. What are the physical and practical limits on the size of a list? Is it possible to retrieve a range of items from a list? Thanks in advance Robert
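the negative capacity in the stack trace is consistent with that truncation: the element count is written as 16 bits and read back as a signed short. for example (the list size here is a guess that happens to reproduce -14594, not the asker's actual number):

  int listSize = 378622;              // hypothetical, somewhere near the ~400,000 elements reported
  short onTheWire = (short) listSize; // only the low 16 bits survive
  System.out.println(onTheWire);      // prints -14594, the "Illegal Capacity" from the stack trace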
New CQL3 driver for Ruby
Hi, For the last few weeks I've been working on a CQL3 driver for Ruby. If you're using Ruby and Cassandra I would very much like your help getting it production ready. You can find the code and documentation here: https://github.com/iconara/cql-rb The driver supports the full CQL3 protocol except for authentication. It's implemented purely in Ruby and has no dependencies. If you try it out and find a bug (which I'm sure you will), please email me directly (t...@iconara.net) or open an issue in the GitHub project. yours, Theo
Re: cql: show tables in a keyspace
the DESCRIBE family of commands in cqlsh are wrappers around queries to the system keyspace, so if you want to inspect what keyspaces and tables exist from your application you can do something like: SELECT columnfamily_name, comment FROM system.schema_columnfamilies WHERE keyspace_name = 'test'; or SELECT * FROM system.schema_keyspaces; T# On Mon, Jan 28, 2013 at 8:35 PM, Brian O'Neill b...@alumni.brown.edu wrote: cqlsh> use <keyspace>; cqlsh:cirrus> describe tables; For more info: cqlsh> help describe -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive * King of Prussia, PA * 19406 M: 215.588.6024 * @boneill42 http://www.twitter.com/boneill42 * healthmarketscience.com On 1/28/13 2:27 PM, Paul van Hoven paul.van.ho...@googlemail.com wrote: Is there some way in cql to get a list of all tables or column families that belong to a keyspace, like show tables in sql?
Re: CQL3 Frame Length
Hi, Another reason for keeping the frame length in the header is that newer versions can add fields to frames without older clients breaking. For example a minor release can add some more content to an existing frame. If clients didn't know the full frame length (and were required by the specification to consume all the bytes) there would be trailing garbage which would most likely crash the client. T# Hey Sylvain, Thanks for explaining the rationale. When you look at it from the perspective of the use cases you mention, it makes sense to be able to supply the reader with the frame size up front. I've opted to go for serializing the frame into a buffer. Although this could materialize an arbitrarily large amount of memory, ultimately the driving application has control of the degree to which this can occur, so in the grander scheme of things, you can still maintain streaming semantics. Thanks for the heads up. Cheers, Ben On Tue, Jan 8, 2013 at 4:08 PM, Sylvain Lebresne sylv...@datastax.com wrote: Mostly this is because having the frame length is convenient to have in practice. Without pretending that there is only one way to write a server, it is common to separate the phase "read a frame from the network" from the phase "decode the frame", which is often simpler if you can read the frame upfront. Also, if you don't have the frame size, it means you need to decode the whole frame before being able to decode the next one, and so you can't parallelize the decoding. It is true however that it means for the write side that you need to be able to either pre-compute the frame body size or to serialize it in memory first. That's a trade-off for making it easier on the read side. But if you want my opinion, on the write side too it's probably worth parallelizing the message encoding (which requires you to encode it in memory first) since it's an asynchronous protocol and so there will likely be multiple writers simultaneously. -- Sylvain On Tue, Jan 8, 2013 at 12:48 PM, Ben Hood 0x6e6...@gmail.com wrote: Hi, I've read the CQL wire specification and naively, I can't see how the frame length header is used. To me, it looks like on the read side, you know which type of structures to expect based on the opcode and each structure is TLV encoded. On the write side, you need to encode TLV structures as well, but you don't know the overall frame length until you've encoded it. So it would seem that you either need to pre-calculate the cumulative TLV size before you serialize the frame body, or you serialize the frame body to a buffer which you can then get the size of and then write to the socket, after having first written the count out. Is there potentially an implicit assumption that the reader will want to pre-buffer the entire frame before decoding it? Cheers, Ben
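in practice the write side ends up looking something like this (a schematic Java sketch, not taken from any particular driver): serialize the frame body into a buffer first, then write the header with the now known length, then the buffered body:

  import java.io.ByteArrayOutputStream;
  import java.io.DataOutputStream;
  import java.io.IOException;
  import java.io.OutputStream;

  // CQL native protocol v1 style header: version, flags, stream id, opcode, then a 4 byte body length
  static void writeFrame(OutputStream out, byte version, byte flags, byte streamId, byte opcode, ByteArrayOutputStream body) throws IOException {
    DataOutputStream frame = new DataOutputStream(out);
    frame.writeByte(version);
    frame.writeByte(flags);
    frame.writeByte(streamId);
    frame.writeByte(opcode);
    frame.writeInt(body.size()); // the frame length is known because the body was buffered first
    body.writeTo(frame);         // then the buffered body itself
    frame.flush();
  }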