Re: Exploring Simply Queueing
Chris,

thanks for taking a look.

On 06 Oct 2014, at 04:44, Chris Lohfink clohf...@blackbirdit.com wrote:

> It appears you are aware of the tombstone effect that leads people to label this an anti-pattern. Without due or any time-based value being part of the partition key, you will still get a lot of buildup. You only have 1 partition per shard, which just linearly decreases the tombstones. That isn't likely to be enough to really help in a situation of high queue throughput, especially with the default of 4 shards.

Yes, dealing with the tombstone effect is the whole point. The workloads I have to deal with are not really high throughput; it is unlikely we’ll ever reach multiple messages per second. The emphasis is also more on coordinating producer and consumer than on high-volume capacity problems.

Your comment seems to suggest including larger time frames (e.g. the due-hour) in the partition keys and using the current time to select the active partitions (e.g. the shards of the hour). Once an hour has passed, the corresponding shards will never be touched again.

Am I understanding this correctly?

> You may want to consider switching to LCS from the default STCS, since you are re-writing to the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios, with low enough gc_grace, when it never gets higher than L1, it will be sameish write throughput. In scenarios where you get more, LCS will shine, I suspect, by reducing the number of obsolete tombstones. It would be hard to identify the difference in small tests, I think.

Thanks, I’ll try to explore the various effects.

> What's the plan to prevent two consumers from reading the same message off of a queue? You mention in the docs you will address it at a later point in time, but it's kinda a biggy. Big lock batch reads like the astyanax recipe?

I have included a static column per shard to act as a lock (the ‘lock’ column in the examples) in combination with conditional updates.

I must admit, I have not quite understood what Netflix is doing in terms of coordination, but since performance isn’t our concern, CAS should do fine, I guess(?)

Thanks again,
Jan

> ---
> Chris Lohfink
>
> On Oct 5, 2014, at 6:03 PM, Jan Algermissen jan.algermis...@nordsc.com wrote:
>
>> Hi,
>>
>> I have put together some thoughts on realizing simple queues with Cassandra.
>>
>> https://github.com/algermissen/cassandra-ruby-queue
>>
>> The design is inspired by (the much more sophisticated) Netflix approach [1] but very reduced.
>>
>> Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the design path I took.
>>
>> Jan
>>
>> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
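The time-bucketing idea discussed in this message (putting the due-hour into the partition key and polling only "the shards of the hour") could look roughly like the sketch below. It is only an illustration: the class name, the key layout `queue:hour:shard`, and the queue name are assumptions, not taken from the cassandra-ruby-queue repository.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the key layout "queue:hour:shard" is an assumption for illustration.
final class QueuePartitions {
    // Hour-granularity bucket, formatted in UTC so that all clients agree on it.
    private static final DateTimeFormatter HOUR_BUCKET =
            DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    /** Partition key for one shard of the hour that contains the given instant. */
    static String partitionKey(String queue, Instant due, int shard) {
        return queue + ":" + HOUR_BUCKET.format(due) + ":" + shard;
    }

    /** All partitions a consumer would poll for the hour containing "now". */
    static List<String> activePartitions(String queue, Instant now, int shards) {
        List<String> keys = new ArrayList<>();
        for (int shard = 0; shard < shards; shard++) {
            keys.add(partitionKey(queue, now, shard));
        }
        return keys;
    }

    public static void main(String[] args) {
        // 4 shards mirrors the default mentioned in the thread.
        System.out.println(activePartitions("jobs", Instant.parse("2014-10-06T04:44:00Z"), 4));
    }
}
```

With such keys, partitions of past hours stop receiving writes and deletes, so their tombstones are no longer scanned by new reads. A real consumer would probably also need to drain the previous hour for messages that were still pending when the hour rolled over; the sketch does not cover that.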
Re: Increasing size of Batch of prepared statements
Thanks, Jens, for the comment. Actually, I am using the Cassandra Stress Tool, and it is the tool that inserts such large statements.

But do you mean that inserting large columns (let's say a text of 20-30 K) is potentially problematic in Cassandra? What should I do if I want large columns?

best,
/Shahab

On Sun, Oct 5, 2014 at 6:03 PM, Jens Rantil jens.ran...@tink.se wrote:

> Shahab,
>
> If you are hitting this limit because you are inserting a lot of (CQL) rows in a single batch, I suggest you split the statement up into multiple smaller batches. Generally, large inserts like this will not perform very well.
>
> Cheers,
> Jens
>
> On Fri, Oct 3, 2014 at 6:47 PM, shahab shahab.mok...@gmail.com wrote:
>
>> Hi,
>>
>> I am getting the following warning in the Cassandra log:
>>
>> BatchStatement.java:258 - Batch of prepared statements for [mydb.mycf] is of size 3272725, exceeding specified threshold of 5120 by 3267605.
>>
>> Apparently it relates to the (default) size of prepared insert statements. Is there any way to change the default value?
>>
>> thanks
>> /Shahab
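Jens's suggestion to split one large batch into several smaller ones can be sketched in plain Java as below. This is a minimal illustration with hypothetical names; it splits by row count, whereas the warning threshold in the log (5120) is measured in bytes, so a suitable `maxPerBatch` would have to be chosen from the actual row sizes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts a list of pending rows into batches of bounded size.
final class BatchSplitter {
    static <T> List<List<T>> split(List<T> rows, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, rows.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            rows.add("row-" + i);
        }
        // Each inner list would become one (small) batch statement.
        System.out.println(split(rows, 2));
    }
}
```

Each resulting sub-list would then be sent as its own batch (or as individual statements). Splitting does not help when a single row is itself large, which is the case Shahab asks about.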
Re: Exploring Simply Queueing
Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? It seems like using a proper queuing service would make your life a lot easier.

That being said, there might be a better way to play to the strengths of C*. Ideally everything you do is append only with few deletes or updates. So an interesting way to implement a queue might be to do one insert to put the job in the queue and another insert to mark the job as done or in process or whatever. This would also give you the benefit of being able to replay the state of the queue.

On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen jan.algermis...@nordsc.com wrote:

> Chris, thanks for taking a look. [...]
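Shane's append-only idea (one insert to enqueue a job, another insert to mark it as in process or done, with the queue state recovered by replay) can be illustrated with an in-memory model. This is only a sketch of the replay logic; the class, the state names, and the use of a Java list in place of a Cassandra table are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for an append-only table: rows are only ever added.
final class AppendOnlyQueueLog {
    enum State { QUEUED, IN_PROCESS, DONE }

    private static final class Event {
        final String jobId;
        final State state;

        Event(String jobId, State state) {
            this.jobId = jobId;
            this.state = state;
        }
    }

    private final List<Event> events = new ArrayList<>();

    /** Every state change is a new row; nothing is updated or deleted. */
    void append(String jobId, State state) {
        events.add(new Event(jobId, state));
    }

    /** Replays the log in write order; the latest event per job wins. */
    Map<String, State> replay() {
        Map<String, State> current = new LinkedHashMap<>();
        for (Event event : events) {
            current.put(event.jobId, event.state);
        }
        return current;
    }

    /** Jobs that have not reached DONE according to the replayed state. */
    List<String> pending() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, State> entry : replay().entrySet()) {
            if (entry.getValue() != State.DONE) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        AppendOnlyQueueLog log = new AppendOnlyQueueLog();
        log.append("job-a", State.QUEUED);
        log.append("job-b", State.QUEUED);
        log.append("job-a", State.DONE);
        System.out.println(log.pending());
    }
}
```

Because no rows are deleted, this pattern produces no tombstones, at the cost of a log that grows until old partitions are dropped or expired by some other means.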
Re: CQL query throws TombstoneOverwhelmingException against a LeveledCompactionStrategy table
BTW, I am using Cassandra 2.0.6. Is this the same as CASSANDRA-6654 (Droppable tombstones are not being removed from LCS table despite being above 20%)?

https://issues.apache.org/jira/browse/CASSANDRA-6654

I checked my table in JConsole and the droppable tombstone ratio is over 60%. If it has the same cause, does that mean I should switch to SizeTieredCompactionStrategy?
Re: Exploring Simply Queueing
Hi Jan,

Both Chris and Shane express what I believe is the correct thinking.

Just to let you know: if you base your implementation on Netflix's queue recipe, there are many issues with it. In general, we don't advise people to use that recipe, so I suggest you save your time by not going down that same route again.

Minh

On Mon, Oct 6, 2014 at 7:34 AM, Shane Hansen shanemhan...@gmail.com wrote:

> Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? [...]
Re: Exploring Simply Queueing
On Mon, Oct 6, 2014 at 8:30 AM, Minh Do m...@netflix.com wrote:

> Just to let you know: if you base your implementation on Netflix's queue recipe, there are many issues with it. In general, we don't advise people to use that recipe, so I suggest you save your time by not going down that same route again.

I +1 people who are saying that this is not a strong case for Cassandra. I also agree that if you want to do this, you should consider other approaches.

However, depending on the nature of the queue (low amount of total volume, etc.), things like this can work just fine in practice:

https://engineering.eventbrite.com/replayable-pubsub-queues-with-cassandra-and-zookeeper/

In theory they can also be designed such that history is not infinite, which mitigates the buildup of old queue state.

=Rob
http://twitter.com/rcolidba
ConnectionException while trying to connect with Astyanax over Java driver
All,

I am trying to use the new Astyanax over Java driver to connect to Cassandra version 1.2.12. The following settings are turned on in cassandra.yaml:

    start_rpc: true
    native_transport_port: 9042
    start_native_transport: true

Code to connect:

    final Supplier<List<Host>> hostSupplier = new Supplier<List<Host>>() {
        @Override
        public List<Host> get() {
            List<Host> hosts = new ArrayList<Host>();
            // seedHosts is a comma-separated list of host:port pairs
            for (String hostPort : StringUtil.getSetFromDelimitedString(seedHosts, ",")) {
                String[] pair = hostPort.split(":");
                Host host = new Host(pair[0], Integer.valueOf(pair[1]).intValue());
                host.setRack("rack1");
                hosts.add(host);
            }
            return hosts;
        }
    };

    // get keyspace
    AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
        .forCluster(clusterName)
        .forKeyspace(keyspace)
        .withHostSupplier(hostSupplier)
        .withAstyanaxConfiguration(
            new AstyanaxConfigurationImpl()
                .setDiscoveryType(NodeDiscoveryType.DISCOVERY_SERVICE)
                .setDiscoveryDelayInSeconds(6)
                .setCqlVersion("3.0.0")
                .setTargetCassandraVersion("1.2.12"))
        .withConnectionPoolConfiguration(
            new JavaDriverConfigBuilder().withPort(9042).build())
        .buildKeyspace(CqlFamilyFactory.getInstance());
    context.start();

Exception in Cassandra server logs:

    WARN [New I/O server boss #1 ([id: 0x6815d6c5, /0.0.0.0:9042])] 2014-10-06 11:11:37,826 Slf4JLogger.java (line 82) Failed to accept a connection.
    java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.<init>(IZ)V
        at org.apache.cassandra.transport.Frame$Decoder.<init>(Frame.java:147)
        at org.apache.cassandra.transport.Server$PipelineFactory.getPipeline(Server.java:232)
        at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:276)
        at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:246)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)

I also tried using the Java Driver 2.1.1, but I see a NoHostAvailableException, and I feel the underlying reason is the same as when connecting with the Astyanax Java driver.
Re: IN versus multiple asynchronous queries
As far as latency is concerned, it seems like it wouldn't matter very much if the coordinator has to wait for all the responses to come back, or the client waits for all the responses to come back. I’ve got the same latency either way. I would assume that 50 coordinations is more expensive than one coordination that does 50 times the work, but that’s probably insignificant when compared to the actual fetching of the data from the SSTables.

I do see the point about putting stress on coordinator memory. In general, the documents will be very small, but there will occasionally be some rather large ones, potentially several megabytes in size. Definitely better to not make the coordinator hold on to that memory while it waits for other requests to come back.

Robert

On Oct 4, 2014, at 8:34 AM, DuyHai Doan doanduy...@gmail.com wrote:

> Definitely 50 concurrent queries, possibly in async mode.
>
> If you're using the IN clause with 50 values, the coordinator will block, waiting for 50 partitions to be fetched from different nodes (worst case = 50 nodes) before responding to client. In addition to the very high latency, you'll put the stress on the coordinator memory.
>
> On Sat, Oct 4, 2014 at 3:09 PM, Robert Wille rwi...@fold3.com wrote:
>
>> I have a table of small documents (less than 1K) that are often accessed together as a group. The group size is always less than 50. Which produces less load on the server, one query using an IN clause to get all 50 back together, or 50 concurrent queries? Which one is fastest?
>>
>> Thanks
>>
>> Robert
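The "50 concurrent queries in async mode" approach recommended in this thread follows a simple pattern: start every single-partition request first, then collect the results. The sketch below shows that pattern with plain `CompletableFuture`; the `fetchOne` function is a hypothetical stand-in for whatever asynchronous call the client driver offers, and no real Cassandra driver API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper showing the fan-out/collect pattern only.
final class ConcurrentFetch {
    static <K, V> List<V> fetchAll(List<K> keys, Function<K, CompletableFuture<V>> fetchOne) {
        // Start all requests before waiting on any of them.
        List<CompletableFuture<V>> inFlight = new ArrayList<>();
        for (K key : keys) {
            inFlight.add(fetchOne.apply(key));
        }
        // Collect results in key order; join() blocks until each one completes.
        List<V> results = new ArrayList<>();
        for (CompletableFuture<V> future : inFlight) {
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        // Simulated lookup: in a real client this would be one query per partition key.
        Function<String, CompletableFuture<String>> lookup =
                key -> CompletableFuture.supplyAsync(() -> "document for " + key);
        System.out.println(fetchAll(Arrays.asList("doc-1", "doc-2", "doc-3"), lookup));
    }
}
```

With this shape, each response can be handed to the client as soon as it arrives, so no single coordinator has to buffer all partitions of the group at once.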
assertion error on joining
Hi all,

I'm a bit stuck: I want to expand my C* 2.0.6 cluster, but I encountered an error on the new node.

    ERROR [FlushWriter:2] 2014-10-06 16:15:35,147 CassandraDaemon.java (line 199) Exception in thread Thread[FlushWriter:2,5,main]
    java.lang.AssertionError: 394920
        at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
        at org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
        at org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
        at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
        at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
        ...

The assertion is here:

    public static void writeWithShortLength(ByteBuffer buffer, DataOutput out) throws IOException
    {
        int length = buffer.remaining();
        assert 0 <= length && length <= FBUtilities.MAX_UNSIGNED_SHORT : length; // <-- fails here
        out.writeShort(length);
        write(buffer, out); // writing data bytes to output source
    }

But I don't know what I can do to complete the bootstrap.

Thanks,
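The number in the AssertionError (394920) is the value printed by the failed assertion, i.e. the length of the buffer being written. A short check shows why it fails: the length has to fit into an unsigned 16-bit field. The sketch below assumes MAX_UNSIGNED_SHORT is 0xFFFF (65535); the class and method names are hypothetical. Judging only by `maybeWriteRowHeader` in the stack trace, the buffer is presumably a row key, but the post does not confirm that.

```java
// Hypothetical re-statement of the failing check, for illustration only.
final class ShortLengthCheck {
    // Assumed to match FBUtilities.MAX_UNSIGNED_SHORT: largest value of an unsigned 16-bit field.
    static final int MAX_UNSIGNED_SHORT = 0xFFFF;

    /** True if a buffer of this length can be written with a 2-byte length prefix. */
    static boolean fitsInUnsignedShort(int length) {
        return 0 <= length && length <= MAX_UNSIGNED_SHORT;
    }

    public static void main(String[] args) {
        int reported = 394920; // value printed by the AssertionError in the log
        System.out.println("limit = " + MAX_UNSIGNED_SHORT);
        System.out.println("fits  = " + fitsInUnsignedShort(reported));
        System.out.println("over by " + (reported - MAX_UNSIGNED_SHORT) + " bytes");
    }
}
```

If the oversized value is indeed a key, the row that carries it would have to be found and removed or repaired on the source nodes before the new node can flush it; that step is a guess and not something stated in the thread.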
Re: ConnectionException while trying to connect with Astyanax over Java driver
java.lang.NoSuchMethodError - Jar dependency issue probably. Did you try to create an issue on the Astyanax github repo?

On Mon, Oct 6, 2014 at 6:01 PM, Ruchir Jha ruchir@gmail.com wrote:

> All, I am trying to use the new astyanax over java driver to connect to cassandra version 1.2.12, [...]
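A NoSuchMethodError of this kind usually means that a class was loaded from a different jar version than the one the calling code was compiled against. One way to check this hypothesis is to print where the JVM loaded the class from. The helper below is a minimal sketch with hypothetical names; it uses only standard JDK calls and has to run on the same classpath as the process that throws the error.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports which jar or directory a class came from.
final class JarLocator {
    /** Returns the load location of a class, or a marker if the JVM does not expose one. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap class loader (e.g. java.lang.String) have no code source.
        return source == null || source.getLocation() == null
                ? "(bootstrap or unknown)"
                : source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a class name, e.g. org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

If the printed location points at an unexpected netty jar, two versions of the library are likely competing on that classpath.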
Re: IN versus multiple asynchronous queries
> Definitely better to not make the coordinator hold on to that memory while it waits for other requests to come back

You get it. When loading big documents, you risk starving the heap quickly, triggering long GC cycles on the coordinator, etc.

On Mon, Oct 6, 2014 at 6:22 PM, Robert Wille rwi...@fold3.com wrote:

> As far as latency is concerned, it seems like it wouldn't matter very much if the coordinator has to wait for all the responses to come back, or the client waits for all the responses to come back. [...]
RE: Cassandra Data Model design
You need to rethink your data model for the client_data table. Unlike an RDBMS, Cassandra relies heavily on the primary key for filtering data. In fact, filtering on any column other than the primary key is not recommended in Cassandra. This means that how you design your primary key is critical.

There are two options in this case:

1. Use both client_name and is_valid as the row key.
2. Use client_name as the row key and is_valid as a clustering key; in other words, make a composite key from client_name and is_valid.

Cassandra data model rule: you need to know your query patterns before you create a table.

Rahul Gupta

From: Check Peck [mailto:comptechge...@gmail.com]
Sent: Wednesday, September 17, 2014 4:01 PM
To: user
Subject: Cassandra Data Model design

I have recently started working with Cassandra. We have a Cassandra cluster that runs DSE 4.0 and has vnodes enabled. We have tables like this.

Below is my first table:

    CREATE TABLE customers (
        customer_id int PRIMARY KEY,
        last_modified_date timeuuid,
        customer_value text
    )

The read query pattern on the above table is currently as follows, since we need to get everything from the table and load it into our application memory every x minutes:

    select customer_id, customer_value from datakeyspace.customers;

We have a second table like this:

    CREATE TABLE client_data (
        client_name text PRIMARY KEY,
        client_id text,
        creation_date timestamp,
        is_valid int,
        last_modified_date timestamp
    )

Right now the above table holds 500 records, and all of them have the is_valid column set to 1. The read query pattern on this table is currently as follows, since we need to get everything from the table and load it into our application memory every x minutes. The query below therefore returns all 500 records, since every record has is_valid set to 1:

    select client_name, client_id from datakeyspace.client_data where is_valid=1;

Since our cluster has vnodes enabled, the above query pattern is not efficient at all, and it takes a lot of time to get the data from Cassandra. We read from these tables with consistency level QUORUM.

Is there any possibility of improving our data model? Any suggestions will be greatly appreciated.
Re: ConnectionException while trying to connect with Astyanax over Java driver
That exception is on the Cassandra server and not on the client.

On Mon, Oct 6, 2014 at 2:10 PM, DuyHai Doan doanduy...@gmail.com wrote:

> java.lang.NoSuchMethodError - Jar dependency issue probably. Did you try to create an issue on the Astyanax github repo?
>
> On Mon, Oct 6, 2014 at 6:01 PM, Ruchir Jha ruchir@gmail.com wrote:
>
>> All, I am trying to use the new astyanax over java driver to connect to cassandra version 1.2.12, [...]
Re: Exploring Simply Queueing
Shane,

On 06 Oct 2014, at 16:34, Shane Hansen shanemhan...@gmail.com wrote:

> Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? It seems like using a proper queuing service would make your life a lot easier.

Agreed - however, the use case simply does not justify the additional operations.

> That being said, there might be a better way to play to the strengths of C*. Ideally everything you do is append only with few deletes or updates. So an interesting way to implement a queue might be to do one insert to put the job in the queue and another insert to mark the job as done or in process or whatever. This would also give you the benefit of being able to replay the state of the queue.

Thanks, I’ll try that, too.

Jan

> On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen jan.algermis...@nordsc.com wrote:
>
>> Chris, thanks for taking a look. [...]
Re: Exploring Simply Queueing
Robert,

On 06 Oct 2014, at 17:50, Robert Coli rc...@eventbrite.com wrote:

> In theory they can also be designed such that history is not infinite, which mitigates the buildup of old queue state.

Hmm, I was under the impression that issues with old queue state disappear after gc_grace_seconds and that the goal primarily is to keep the rows ‘short’ enough to achieve a tombstone read performance impact that one can live with in a given use case.

Is that understanding wrong?

Jan
Re: Exploring Simply Queueing
i want answer the first question why one might use cassandra as a queuing solution: - its the only opensource distributed persistence layer (i.e. no SPOF), that you can run over WAN and provide lan/wan specific quorum controls i know its sub optimal, as the deletion imposes additional compaction/repair penalties, but there no other solution i am awaee of. Most AMQP solutions are broker based and clustering is pain, while things like riak only supports wan based cluster in their commercial solution. I would love to know about other alternatives, And thaks for sharing the ruby based priority queue prototype, it helps people like me (sys ad :-) ) exploring these concepts betrter, cheers ranjib On Mon, Oct 6, 2014 at 1:35 PM, Jan Algermissen jan.algermis...@nordsc.com wrote: Shane, On 06 Oct 2014, at 16:34, Shane Hansen shanemhan...@gmail.com wrote: Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? It seems like using a proper queuing service would make your life a lot easier. Agreed - however, the use case simply does not justify the additional operations. That being said, there might be a better way to play to the strengths of C*. Ideally everything you do is append only with few deletes or updates. So an interesting way to implement a queue might be to do one insert to put the job in the queue and another insert to mark the job as done or in process or whatever. This would also give you the benefit of being able to replay the state of the queue. Thanks, I’ll try that, too. Jan On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen jan.algermis...@nordsc.com wrote: Chris, thanks for taking a look. On 06 Oct 2014, at 04:44, Chris Lohfink clohf...@blackbirdit.com wrote: It appears you are aware of the tombstones affect that leads people to label this an anti-pattern. Without due or any time based value being part of the partition key means you will still get a lot of buildup. 
You only have 1 partition per shard, which just linearly decreases the tombstones. That isn't likely to be enough to really help in a situation of high queue throughput, especially with the default of 4 shards. Yes, dealing with the tombstone effect is the whole point. The workloads I have to deal with are not really high throughput; it is unlikely we’ll ever reach multiple messages per second. The emphasis is also more on coordinating producer and consumer than on high-volume capacity problems. Your comment seems to suggest including larger time frames (e.g. the due-hour) in the partition keys and using the current time to select the active partitions (e.g. the shards of the hour). Once an hour has passed, the corresponding shards will never be touched again. Am I understanding this correctly? You may want to consider switching to LCS from the default STCS, since you are rewriting to the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios, with low enough gc_grace, when it never gets higher than L1, it will be sameish write throughput. In scenarios where you get more, LCS will shine, I suspect, by reducing the number of obsolete tombstones. Would be hard to identify the difference in small tests, I think. Thanks, I’ll try to explore the various effects. What's the plan to prevent two consumers from reading the same message off of a queue? You mention in the docs you will address it at a later point in time, but it's kind of a biggie. Big lock batch reads like the astyanax recipe? I have included a static column per shard to act as a lock (the ‘lock’ column in the examples) in combination with conditional updates. I must admit, I have not quite understood what Netflix is doing in terms of coordination - but since performance isn’t our concern, CAS should do fine, I guess(?) Thanks again, Jan --- Chris Lohfink On Oct 5, 2014, at 6:03 PM, Jan Algermissen jan.algermis...@nordsc.com wrote: Hi, I have put together some thoughts on realizing simple queues with Cassandra. 
https://github.com/algermissen/cassandra-ruby-queue The design is inspired by (the much more sophisticated) Netflix approach[1] but very reduced. Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the design path I took. Jan [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
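A rough illustration of the time-bucketing idea discussed in this message (putting the due-hour into the partition key and using the current time to pick the active shards) might look like the Python sketch below. The key layout, `SHARD_COUNT`, and the function names are assumptions made for illustration; they are not taken from cassandra-ruby-queue or the Astyanax recipe.

```python
from datetime import datetime, timezone
import zlib

SHARD_COUNT = 4  # assumed; the thread mentions a default of 4 shards


def hour_bucket(ts: datetime) -> str:
    """Truncate a timestamp to its UTC hour, e.g. '2014100617'."""
    return ts.astimezone(timezone.utc).strftime("%Y%m%d%H")


def partition_key(queue: str, due: datetime, message_id: str) -> tuple:
    """Hypothetical partition key: (queue name, due-hour, shard)."""
    shard = zlib.crc32(message_id.encode("utf-8")) % SHARD_COUNT
    return (queue, hour_bucket(due), shard)


def active_partitions(queue: str, now: datetime) -> list:
    """Partitions a consumer would poll for the current hour."""
    bucket = hour_bucket(now)
    return [(queue, bucket, shard) for shard in range(SHARD_COUNT)]
```

Once the hour has passed, `active_partitions` no longer returns the old keys, so those partitions and their tombstones are no longer read.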
Bitmaps
Hi Guys, what data type do you recommend for storing bitmaps? I am planning to store bitmaps of length 90,000,000 and then query them by key. Example: key : 22_ES bitmap : 10101101010111010101011 Thanks Eduardo
Re: Exploring Simply Queueing
On Mon, Oct 6, 2014 at 1:40 PM, Jan Algermissen jan.algermis...@nordsc.com wrote: Hmm, I was under the impression that issues with old queue state disappear after gc_grace_seconds and that the goal primarily is to keep the rows ‘short’ enough to achieve a tombstones read performance impact that one can live with in a given use case. The design I pasted a link to does not include specifics regarding pruning old history. Yes, you can just delete it, if your system design doesn't require replay from the start. =Rob
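To make the replay point concrete, here is a minimal in-memory Python sketch of the append-only variant suggested earlier in the thread (one insert to enqueue a job, another insert to mark it done). The event tuple layout and the `ENQUEUED`/`DONE` names are assumptions for illustration only; no Cassandra driver is involved.

```python
from collections import OrderedDict

# Event kinds are assumptions for this sketch.
ENQUEUED, DONE = "enqueued", "done"


def replay(events):
    """Rebuild the pending jobs from an append-only list of events.

    Each event is a (job_id, kind, payload) tuple in write order.
    """
    pending = OrderedDict()
    for job_id, kind, payload in events:
        if kind == ENQUEUED:
            pending[job_id] = payload
        elif kind == DONE:
            # The log itself is never modified; only the derived view shrinks.
            pending.pop(job_id, None)
    return pending
```

Replaying from the first event reconstructs the full queue state, which is only possible as long as the old history has not been pruned.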
Re: Bitmaps
I highly recommend against storing data structures like this in C*. That really isn't its sweet spot. For instance, if you were to use the blob type, which will give you the smallest size, you are still looking at a cell size of (90,000,000/8/1024) = 10,986 KB, or over 10 MB, which is prohibitively large. Additionally, there is no way to modify the bitmap in place; you would have to read the entire structure out and write it back in. You could store one bit per cell, but that would essentially defeat the purpose of the bitmap's compact size. On Mon, Oct 6, 2014 at 4:46 PM, Eduardo Cusa eduardo.c...@usmediaconsulting.com wrote: Hi Guys, what data type recommend to store bitmaps? I am planning to store maps of 90,000,000 length and then query by key. Example: key : 22_ES bitmap : 10101101010111010101011 Thanks Eduardo
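The size estimate above can be checked with a few lines of Python. This is just the arithmetic from the message, assuming the bitmap is packed 8 bits per byte into a single blob; the variable names are illustrative.

```python
BITS = 90_000_000

size_bytes = BITS // 8        # packed 8 bits per byte
size_kib = size_bytes / 1024  # kilobytes (1024-byte units)
size_mib = size_kib / 1024    # megabytes (1024-KiB units)

print(f"{size_bytes:,} bytes = {size_kib:,.0f} KiB = {size_mib:.1f} MiB")
# prints "11,250,000 bytes = 10,986 KiB = 10.7 MiB"
```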
Re: Indexes Fragmentation
On Fri, Oct 3, 2014 at 6:03 PM, Arthur Zubarev arthur.zuba...@aol.com wrote: I now see I had misspelled the word tall for toll, anyways, if I understood correctly, your reply implies there is no impact whatsoever and there is no need to defrag indexes of the frequently changing columns. Cases with lots of secondary indexes which have a lot of churn are not well suited for a database with immutable datafiles which wants to be accessed by Primary Key. The fragmentation is really bad, because the data files are immutable and you have a lot of churn. Probably don't do it? =Rob
Re: Bitmaps
Isn't there a video of Ooyala at some past Cassandra Summit demonstrating usage of Cassandra for text search using trigrams? AFAIK they were storing a kind of bitmap to perform OR and AND operations on trigrams. On Mon, Oct 6, 2014 at 10:53 PM, Russell Bradberry rbradbe...@gmail.com wrote: I highly recommend against storing data structures like this in C*. That really isn't its sweet spot. For instance, if you were to use the blob type, which will give you the smallest size, you are still looking at a cell size of (90,000,000/8/1024) = 10,986 KB, or over 10 MB, which is prohibitively large. Additionally, there is no way to modify the bitmap in place; you would have to read the entire structure out and write it back in. You could store one bit per cell, but that would essentially defeat the purpose of the bitmap's compact size. On Mon, Oct 6, 2014 at 4:46 PM, Eduardo Cusa eduardo.c...@usmediaconsulting.com wrote: Hi Guys, what data type recommend to store bitmaps? I am planning to store maps of 90,000,000 length and then query by key. Example: key : 22_ES bitmap : 10101101010111010101011 Thanks Eduardo
Re: Bitmaps
You certainly have plenty of freedom to trade off size vs access granularity using multiple blobs. It really depends on how mutable the data is, how you intend to read it, whether it is highly sparse or highly dense (in which case you perhaps don’t need to store every bit), etc. On Oct 6, 2014, at 3:56 PM, DuyHai Doan doanduy...@gmail.com wrote: Isn't there a video of Ooyala at some past Cassandra Summit demonstrating usage of Cassandra for text search using Trigram ? AFAIK they were storing kind of bitmap to perform OR AND operations on trigram On Mon, Oct 6, 2014 at 10:53 PM, Russell Bradberry rbradbe...@gmail.com wrote: I highly recommend against storing data structures like this in C*. That really isn't its sweet spot. For instance, if you were to use the blob type, which will give you the smallest size, you are still looking at a cell size of (90,000,000/8/1024) = 10,986 KB, or over 10 MB, which is prohibitively large. Additionally, there is no way to modify the bitmap in place; you would have to read the entire structure out and write it back in. You could store one bit per cell, but that would essentially defeat the purpose of the bitmap's compact size. On Mon, Oct 6, 2014 at 4:46 PM, Eduardo Cusa eduardo.c...@usmediaconsulting.com wrote: Hi Guys, what data type recommend to store bitmaps? I am planning to store maps of 90,000,000 length and then query by key. Example: key : 22_ES bitmap : 10101101010111010101011 Thanks Eduardo
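One way to picture the size vs access granularity trade-off with multiple blobs is the following Python sketch. It keeps the chunks in a plain dict that stands in for a table keyed by (bitmap key, chunk number); the 64 KiB chunk size and all names here are assumptions for illustration, not a recommendation from the thread.

```python
CHUNK_BITS = 8 * 64 * 1024  # assumed chunk size: 64 KiB of packed bits per blob


def locate(bit_index: int) -> tuple:
    """Map a global bit index to (chunk number, byte offset, bit mask)."""
    chunk, within = divmod(bit_index, CHUNK_BITS)
    byte_offset, bit = divmod(within, 8)
    return chunk, byte_offset, 1 << bit


def set_bit(chunks: dict, bit_index: int) -> None:
    """Set one bit, rewriting only the chunk that contains it.

    Missing chunks are treated as all zeros, so sparse bitmaps stay small.
    """
    chunk, byte_offset, mask = locate(bit_index)
    blob = bytearray(chunks.get(chunk, bytes(CHUNK_BITS // 8)))
    blob[byte_offset] |= mask
    chunks[chunk] = bytes(blob)


def get_bit(chunks: dict, bit_index: int) -> bool:
    """Read one bit; a chunk that was never written reads as zero."""
    chunk, byte_offset, mask = locate(bit_index)
    blob = chunks.get(chunk)
    return bool(blob and blob[byte_offset] & mask)
```

Smaller chunks make each read-modify-write cheaper but increase the number of cells per bitmap; larger chunks do the opposite.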
Re: Bitmaps
On Mon, Oct 6, 2014 at 1:56 PM, DuyHai Doan doanduy...@gmail.com wrote: Isn't there a video of Ooyala at some past Cassandra Summit demonstrating usage of Cassandra for text search using Trigram ? AFAIK they were storing kind of bitmap to perform OR AND operations on trigram That sounds like the talk Matt Stump gave at the 2013 SF Summit. Video: https://www.youtube.com/watch?v=E92u4FXGiAM Slides: http://www.slideshare.net/planetcassandra/1-matt-stump
Re: Bitmaps
Yes this one, not Ooyala sorry. Very inventive usage of C* indeed. Thanks for the links On Mon, Oct 6, 2014 at 11:01 PM, Peter Sanford psanf...@retailnext.net wrote: On Mon, Oct 6, 2014 at 1:56 PM, DuyHai Doan doanduy...@gmail.com wrote: Isn't there a video of Ooyala at some past Cassandra Summit demonstrating usage of Cassandra for text search using Trigram ? AFAIK they were storing kind of bitmap to perform OR AND operations on trigram That sounds like the talk Matt Stump gave at the 2013 SF Summit. Video: https://www.youtube.com/watch?v=E92u4FXGiAM Slides: http://www.slideshare.net/planetcassandra/1-matt-stump
Dynamic schema modification an anti-pattern?
There is a team at my work building an entity-attribute-value (EAV) store using Cassandra. There is a column family, called Entity, where the partition key is the UUID of the entity, and the columns are the attribute names with their values. Each entity will contain hundreds to thousands of attributes, out of a list of up to potentially ten thousand known attribute names. However, instead of using wide rows with dynamic columns (and serializing type info with the value), they are trying to use a static column family and modifying the schema dynamically as new named attributes are created. (I believe one of the main drivers of this approach is to use collection columns for certain attributes, and perhaps to preserve type metadata for a given attribute.) This approach goes against everything I've seen and done in Cassandra, and is generally an anti-pattern for most persistence stores, but I want to gather feedback before taking the next step with the team. Do others consider this approach an anti-pattern, and if so, what are the practical downsides? For one, this means that the Entity schema would contain the superset of all columns for all rows. What is the impact of having thousands of column names in the schema? And what are the implications of modifying the schema dynamically on a decent-sized cluster (5 nodes now, growing to 10s later) under load? Thanks, Todd
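For comparison, the alternative mentioned above (wide rows with dynamic columns, serializing type info with the value) could look roughly like this Python sketch. The JSON envelope with "t" and "v" fields, the tag names, and the `encode`/`decode` helpers are hypothetical; a dict stands in for one entity's wide row.

```python
import json


def encode(value) -> bytes:
    """Serialize a value together with a type tag (hypothetical format)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        tag = "bool"
    elif isinstance(value, int):
        tag = "int"
    elif isinstance(value, float):
        tag = "float"
    elif isinstance(value, str):
        tag = "text"
    elif isinstance(value, list):
        tag = "list"
    else:
        raise TypeError(f"unsupported attribute type: {type(value).__name__}")
    return json.dumps({"t": tag, "v": value}).encode("utf-8")


def decode(blob: bytes) -> tuple:
    """Return (type tag, value) from a blob produced by encode()."""
    doc = json.loads(blob.decode("utf-8"))
    return doc["t"], doc["v"]


# One wide row per entity: attribute name -> tagged value.
# Adding a new attribute name needs no schema change.
entity = {
    "height_cm": encode(182),
    "tags": encode(["a", "b"]),
}
```

With this layout the type metadata travels with each value, at the cost of the database no longer knowing or enforcing the attribute types.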