[RELEASE] Achilles 3.0.4
Hello all,

We are happy to announce the release of Achilles 3.0.4. Among the biggest changes:

- support for static columns: http://goo.gl/o7D5yo
- dynamic statement logging/tracing at runtime: http://goo.gl/w4jlqZ
- SchemaBuilder, the mirror of QueryBuilder for creating schemas programmatically: http://goo.gl/DspJQq

Link to the changelog: http://goo.gl/tKqpFT

Regards
Duy Hai DOAN
Re: keyspace with hundreds of columnfamilies
Tommaso, looking at your description of the architecture, an idea came up: you can perform sharding in the Cassandra client and write to different Cassandra clusters to keep the number of column families reasonable.

With best regards,
Ilya

On Thu, Jul 3, 2014 at 10:55 PM, tommaso barbugli tbarbu...@gmail.com wrote:

Thank you for the replies; I am rethinking the schema design. One possible solution is to implode one dimension and get N times fewer CFs. With this approach I would end up with (CQL) tables with up to 100 columns; would that be a problem?

Thank you,
Tommaso

2014-07-02 23:43 GMT+02:00 Jack Krupansky j...@basetechnology.com:

The official answer, engraved in stone tablets, and carried down from the mountain: "Although having more than dozens or hundreds of tables defined is almost certainly a Bad Idea (just as it is a design smell in a relational database), it's relatively straightforward to allow disabling the SlabAllocator." Emphasis on "almost certainly a Bad Idea." See: https://issues.apache.org/jira/browse/CASSANDRA-5935 "Allow disabling slab allocation"

IOW, this is considered an anti-pattern, but...

-- Jack Krupansky

From: tommaso barbugli tbarbu...@gmail.com
Sent: Wednesday, July 2, 2014 2:16 PM
To: user@cassandra.apache.org
Subject: Re: keyspace with hundreds of columnfamilies

Hi,

Thank you for your replies on this. Regarding the arena memory: is this a fixed memory allocation, or some sort of in-memory caching? I ask because I think a substantial portion of the column families created will not be queried that frequently (and some will become inactive and stay that way for a really long time).

Thank you,
Tommaso

2014-07-02 18:35 GMT+02:00 Romain HARDOUIN romain.hardo...@urssaf.fr:

Arena allocation is an improvement, not a limitation. It was introduced in Cassandra 1.0 in order to lower memory fragmentation (and therefore promotion failures). AFAIK it's not intended to be tweaked, so it might not be a good idea to change it.
Best,
Romain

tommaso barbugli tbarbu...@gmail.com wrote on 02/07/2014 17:40:18:

1 MB per column family sounds pretty bad to me; is this something I can tweak or work around somehow?

Thanks,
Tommaso

2014-07-02 17:21 GMT+02:00 Romain HARDOUIN romain.hardo...@urssaf.fr:

The trap is that each CF will consume 1 MB of memory due to arena allocation. This might seem harmless, but if you plan thousands of CFs it means thousands of megabytes... Up to 1,000 CFs I think it could be doable, but not 10,000.

Best,
Romain

tommaso barbugli tbarbu...@gmail.com wrote on 02/07/2014 10:13:41:

Subject: keyspace with hundreds of columnfamilies

Hi,

Are there any known issues or shortcomings in organising data in hundreds of column families? At present I am running with 300 column families, but I expect that to grow to a couple of thousand. Is this discouraged/unsupported? (I am using Cassandra 2.0.)

Thanks,
Tommaso
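Romain's numbers can be sanity-checked with simple arithmetic. A minimal sketch, assuming the ~1 MB-per-column-family arena figure he quotes (the figure is the assumption here, not something Cassandra guarantees):

```python
# Rough heap-overhead estimate for memtable arena allocation,
# assuming ~1 MB reserved per column family (Romain's figure).
ARENA_MB_PER_CF = 1

def arena_overhead_mb(num_column_families: int) -> int:
    """Approximate heap consumed by per-CF arena allocation, in MB."""
    return num_column_families * ARENA_MB_PER_CF

print(arena_overhead_mb(300))     # Tommaso's current cluster -> 300 MB
print(arena_overhead_mb(10_000))  # the scale Romain calls undoable -> ~10 GB
```

At 300 CFs the overhead is a few hundred megabytes; at 10,000 it approaches 10 GB of heap before any data is written, which matches Romain's "doable at 1,000, not at 10,000" judgment.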
Re: Multi-column range scans
Sorry, I've just checked; the correct query should be:

select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id,skill_level) < (140235990,11);

On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com wrote:

Hello Matthew,

Since Cassandra 2.0.6 it is possible to query over composites: https://issues.apache.org/jira/browse/CASSANDRA-4851

For your example:

select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and interval_id < 140235990;

On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen matthew.j.al...@gmail.com wrote:

Hi,

We have a roll-up table as follows:

CREATE TABLE SKILL_COUNT (
  skill text,
  interval_id bigint,
  skill_level int,
  skill_count int,
  PRIMARY KEY (skill, interval_id, skill_level));

Essentially:
skill = a named skill, i.e. Complaints
interval_id = a rounded epoch time (15-minute intervals)
skill_level = a number/rating from 1-10
skill_count = the number of people with the specified skill, at the specified skill level, logged in at the interval_id

We'd like to run the following query against it to get a count of people with the relevant skill and level at the appropriate time:

select * from skill_count where skill='Complaints' and interval_id >= 140235930 and interval_id < 140235990 and skill_level >= 5;

However I am getting the following message:

Bad Request: PRIMARY KEY part skill_level cannot be restricted (preceding part interval_id is either not restricted or by a non-EQ relation)

Looking at how the data is stored ...
-------------------
RowKey: Complaints
=> (name=140235930:2:, value=, timestamp=1405308260403000)
=> (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
=> (name=140235930:5:, value=, timestamp=1405308260403001)
=> (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
=> (name=140235930:8:, value=, timestamp=1405308260419000)
=> (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
=> (name=140235930:10:, value=, timestamp=1405308260419001)
=> (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)

Should Cassandra be able to allow for an extra level of filtering, or is this something that should be performed within the application?

We have a solution working in Oracle, but would like to store this data in Cassandra, as all the other data that this solution relies on already sits within Cassandra.

Appreciate any guidance on this matter.

Matt
Re: Multi-column range scans
Or:

select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id) < (140235990)

Strangely enough, once you start using the tuple notation you need to stick to it, even if there is only one element in the tuple.

On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan doanduy...@gmail.com wrote:

Sorry, I've just checked; the correct query should be:

select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id,skill_level) < (140235990,11);
Re: Multi-column range scans
I don't think your query is doing what he wants. Your query will correctly set the starting point, but will also return larger interval_ids with lower skill_levels:

cqlsh:test> select * from skill_count where skill='Complaints' and (interval_id, skill_level) >= (140235930, 5);

 skill      | interval_id | skill_level | skill_count
------------+-------------+-------------+-------------
 Complaints |   140235930 |           5 |          20
 Complaints |   140235930 |           8 |          30
 Complaints |   140235930 |          10 |           1
 Complaints |   140235940 |           2 |          10
 Complaints |   140235940 |           8 |          30

(5 rows)

cqlsh:test> select * from skill_count where skill='Complaints' and (interval_id, skill_level) >= (140235930, 5) and (interval_id) < (140235990);

 skill      | interval_id | skill_level | skill_count
------------+-------------+-------------+-------------
 Complaints |   140235930 |           5 |          20   <- desired
 Complaints |   140235930 |           8 |          30   <- desired
 Complaints |   140235930 |          10 |           1   <- desired
 Complaints |   140235940 |           2 |          10   <- SKIP
 Complaints |   140235940 |           8 |          30   <- desired

The query results in a discontinuous range slice, so it isn't supported. Essentially, the client will have to read the entire range and perform client-side filtering. Whether this is efficient depends on the cardinality of skill_level. I tried playing with the ALLOW FILTERING cql clause, but it would appear from the documentation that it's very restrictive...
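The client-side filtering described above can be sketched in plain Python. The rows here stand in for the result set a driver would return for the interval_id range; the function name and tuple layout are illustrative, not from any driver API:

```python
# Client-side filtering for the discontinuous slice Cassandra can't serve:
# fetch the whole interval_id range, then drop rows below the skill_level cutoff.
rows = [
    # (skill, interval_id, skill_level, skill_count)
    ("Complaints", 140235930, 5, 20),
    ("Complaints", 140235930, 8, 30),
    ("Complaints", 140235930, 10, 1),
    ("Complaints", 140235940, 2, 10),
    ("Complaints", 140235940, 8, 30),
]

def filter_by_skill_level(rows, min_level):
    """Keep only rows at or above min_level, regardless of interval_id."""
    return [r for r in rows if r[2] >= min_level]

filtered = filter_by_skill_level(rows, 5)
total = sum(r[3] for r in filtered)   # 20 + 30 + 1 + 30 = 81
```

This reproduces exactly the "desired"/"SKIP" split in the cqlsh output: the (140235940, 2) row is dropped while every row with skill_level >= 5 survives, whatever its interval_id.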
On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan doanduy...@gmail.com wrote:

Or: select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id) < (140235990)

--
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
http://www.schange.com/en-US/Company/InvestorRelations.aspx
Office: +1 (978)
Re: Multi-column range scans
Exactly, Ken. I got bitten again by the semantics of composite tuples. This kind of query won't be possible until something like a wide-row end slice predicate is available (https://issues.apache.org/jira/browse/CASSANDRA-6167), if it ever is.

On Mon, Jul 14, 2014 at 5:02 PM, Ken Hancock ken.hanc...@schange.com wrote:

I don't think your query is doing what he wants. Your query will correctly set the starting point, but will also return larger interval_ids with lower skill_levels. The query results in a discontinuous range slice, so it isn't supported. Essentially, the client will have to read the entire range and perform client-side filtering.
Upgrading from 1.1.9 to 1.2.18
Hello All,

I'm trying to upgrade from a 3-node 1.1.9 cluster to a 6-node 1.2.18 cluster on Ubuntu. Can sstableloader be used to stream from the existing cluster to the new cluster? If so, what is the suggested method? I keep getting the following when trying this:

partitioner org.apache.cassandra.dht.RandomPartitioner does not match system partitioner org.apache.cassandra.dht.Murmur3Partitioner. Note that the default partitioner starting with Cassandra 1.2 is Murmur3Partitioner, so you will need to edit that to match your old partitioner if upgrading.

It would appear that 1.1.9 doesn't have Murmur3Partitioner though, so I changed the partitioner on the new cluster to RandomPartitioner. Even with that, I get the following error:

CLASSPATH=/etc/cassandra/conf/cassandra.yaml:/root/lib_cass15/apache-cassandra-1.2.18.jar:/root/lib_cass15/guava-13.0.1.jar:/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.1.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.1.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.1.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.7.0.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/stress.jar

Could not retrieve endpoint ranges:
java.lang.RuntimeException: Could not retrieve endpoint ranges:
 at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:233)
 at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:119)
 at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:67)
Caused by: org.apache.thrift.transport.TTransportException
 at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
 at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_ring(Cassandra.java:1155)
 at org.apache.cassandra.thrift.Cassandra$Client.describe_ring(Cassandra.java:1142)
 at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:212)
 ... 2 more

Is there a way to get sstableloader to work? If not, can someone point me to documentation explaining other ways to migrate the data/keyspaces? I haven't been able to find any detailed docs.

Thank you

If you received this message and have reason to believe the sender did not intend to direct it to you, please notify the sender immediately by e-mail and delete the message from your system. This message (including any attachments) may contain confidential and/or proprietary information that should be read only by certain individuals. As a result, any unauthorized disclosure, copying, or distribution of this e-mail and the information contained herein is strictly prohibited and may constitute a violation of law. If you have any questions about this e-mail please notify the sender immediately.
Re: UnavailableException
Mark,

Here you go:

NodeTool status:

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load     Tokens  Owns  Host ID                               Rack
UN  10.10.20.15  1.62 TB  256     8.1%  01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
UN  10.10.20.19  1.66 TB  256     8.3%  30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
UN  10.10.20.35  1.62 TB  256     9.0%  17cb8772-2444-46ff-8525-33746514727d  rack1
UN  10.10.20.31  1.64 TB  256     8.3%  1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
UN  10.10.20.52  1.59 TB  256     9.1%  6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
UN  10.10.20.27  1.66 TB  256     7.7%  76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
UN  10.10.20.22  1.66 TB  256     8.9%  46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
UN  10.10.20.39  1.68 TB  256     8.0%  b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
UN  10.10.20.45  1.49 TB  256     7.7%  8d6bce33-8179-4660-8443-2cf822074ca4  rack1
UN  10.10.20.47  1.64 TB  256     7.9%  bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
UN  10.10.20.62  1.59 TB  256     8.2%  84b47313-da75-4519-94f3-3951d554a3e5  rack1
UN  10.10.20.51  1.66 TB  256     8.9%  0343cd58-3686-465f-8280-56fb72d161e2  rack1

Astyanax Connection Settings:

seeds: 12
maxConns: 16
maxConnsPerHost: 16
connectTimeout: 2000
socketTimeout: 6
maxTimeoutCount: 16
maxBlockedThreadsPerHost: 16
maxOperationsPerConnection: 16
DiscoveryType: RING_DESCRIBE
ConnectionPoolType: TOKEN_AWARE
DefaultReadConsistencyLevel: CL_QUORUM
DefaultWriteConsistencyLevel: CL_QUORUM

On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy mark.re...@boxever.com wrote:

Can you post the output of nodetool status and your Astyanax connection settings?

On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha ruchir@gmail.com wrote:

This is how we create our keyspace. We just ran this command once through a cqlsh session on one of the nodes, so I don't quite understand what you mean by "check that your DC names match up":

CREATE KEYSPACE prod WITH replication = { 'class': 'NetworkTopologyStrategy', 'datacenter1': '3' };

On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink clohf...@blackbirdit.com wrote:

What replication strategy are you using?
If using NetworkTopologyStrategy, double check that your DC names match up (case sensitive).

Chris

On Jul 11, 2014, at 9:38 AM, Ruchir Jha ruchir@gmail.com wrote:

Here's the complete stack trace:

com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), attempts=3]UnavailableException()
 at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
 at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
 at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
 at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
 at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
 at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
Caused by: UnavailableException()
 at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
 at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
 at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
 at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
 ... 12 more

On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav ipremya...@gmail.com wrote:

Please post the full exception.

On Fri, Jul 11, 2014 at 1:50 PM, Ruchir Jha ruchir@gmail.com wrote:

We have a 12-node cluster and we are consistently seeing this exception being thrown during peak write traffic. We have a replication factor of 3 and a write consistency level of QUORUM. Also note there is no unusual or Full GC activity during this time. Appreciate any help.

Sent from my iPhone
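The UnavailableException in this thread follows from standard Cassandra quorum arithmetic (the RF and consistency level are taken from the thread; the formula itself is the usual floor(RF/2) + 1):

```python
# QUORUM needs floor(RF/2) + 1 replicas alive for the token range being written.
def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

rf = 3
needed = quorum(rf)   # 2 of the 3 replicas must respond

# The coordinator raises UnavailableException when it believes fewer than
# `needed` replicas for the range are alive, e.g. 2 of 3 briefly down:
alive = 1
write_possible = alive >= needed
print(needed, write_possible)
```

So with RF=3 and CL_QUORUM, losing (or merely failing to gossip with) two replicas for any token range is enough to make writes to that range unavailable, even if the rest of the 12-node cluster is healthy.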
Re: Cassandra use cases/Strengths/Weakness
We've struggled to get consistent write latency and linear write scalability with a pretty heavy insert load (1000s of records/second), and our records are about 1k-2k of data (a mix of integer/string columns and a blob). Wondering if you have any rough numbers for your "small to medium write sizes" experience?

On 07/04/2014 01:58 PM, James Horey wrote:
...
* Low write latency with respect to small to medium write sizes (logs, sensor data, etc.)
* Linear write scalability
* ...
Re: Upgrading from 1.1.9 to 1.2.18
On Mon, Jul 14, 2014 at 9:54 AM, Denning, Michael michael.denn...@kavokerrgroup.com wrote:

I'm trying to upgrade from a 3-node 1.1.9 cluster to a 6-node 1.2.18 cluster on Ubuntu. Can sstableloader be used to stream from the existing cluster to the new cluster? If so, what is the suggested method? I keep getting the following when trying this:

http://palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

One of the caveats mentioned there is that sstableloader often does not work between major versions. If I were you, I would accomplish this task by dividing it in two:

1) Upgrade the 3-node cluster from 1.1.9 to 1.2.18 via rolling restart/upgradesstables.
2) Expand the 3-node cluster to 6 nodes.

Is there a reason you are not using this process?

=Rob
RE: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18
The 3-node cluster is in production; it'd be difficult for me to get sign-off on the change control to upgrade it. The 6-node cluster is already stood up (in AWS). In an ideal scenario I'd just be able to bring the data over to the new cluster.

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Monday, July 14, 2014 1:53 PM
To: user@cassandra.apache.org
Subject: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18

http://palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

One of the caveats mentioned there is that sstableloader often does not work between major versions. If I were you, I would accomplish this task by dividing it in two:

1) Upgrade the 3-node cluster from 1.1.9 to 1.2.18 via rolling restart/upgradesstables.
2) Expand the 3-node cluster to 6 nodes.

Is there a reason you are not using this process?

=Rob
Re: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18
On Mon, Jul 14, 2014 at 11:12 AM, Denning, Michael michael.denn...@kavokerrgroup.com wrote:

The 3-node cluster is in production; it'd be difficult for me to get sign-off on the change control to upgrade it. The 6-node cluster is already stood up (in AWS). In an ideal scenario I'd just be able to bring the data over to the new cluster.

Ok, use the "copy the sstables" method from the previous link:

1) Fork writes so all writes go to both clusters.
2) nodetool flush on the source cluster.
3) Copy all sstables to all target nodes, being careful to avoid name collisions (use a rolling restart, probably; refresh is unsafe).
4) Run cleanup on the target nodes (this will have the same effect as doing an upgradesstables, as a bonus).
5) Turn off writes to the old cluster / turn on reads on the new cluster.

If I were you, I would strongly consider not using vnodes on your new cluster. Unless you are very confident the cluster will grow above approximately 10 nodes in the near future, you are likely to Just Lose from vnodes.

=Rob
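Step 1 of the list above (forking writes) is usually done at the application layer. A minimal sketch, assuming hypothetical client objects with an `insert` method; any driver session would fit the same shape, but these names are illustrative, not a real driver API:

```python
# Minimal dual-write fork for a cluster migration (step 1 above).
# `old_cluster` / `new_cluster` are hypothetical client objects.
class ForkedWriter:
    def __init__(self, old_cluster, new_cluster):
        self.old = old_cluster
        self.new = new_cluster

    def insert(self, key, value):
        # Write to the source cluster first (still the system of record),
        # then mirror the same write to the target cluster.
        self.old.insert(key, value)
        self.new.insert(key, value)

# Stub clients just to demonstrate the fork:
class StubCluster:
    def __init__(self):
        self.data = {}
    def insert(self, key, value):
        self.data[key] = value

old, new = StubCluster(), StubCluster()
writer = ForkedWriter(old, new)
writer.insert("k1", "v1")
print(old.data == new.data)  # True
```

Once forking is in place, the sstable copy (steps 2-4) only has to catch up on data written before the fork was enabled; everything after it is already on both clusters.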
Re: UnavailableException
Is there a line when doing nodetool info/status like:

Datacenter: datacenter1
=======================

You need to make sure the datacenter name matches the name specified in your replication strategy.

Chris

On Jul 14, 2014, at 12:04 PM, Ruchir Jha ruchir@gmail.com wrote:

Mark, here you go: ...
We just ran this command once through a cqlsh session on one of the nodes, so I don't quite understand what you mean by "check that your DC names match up":

CREATE KEYSPACE prod WITH replication = { 'class': 'NetworkTopologyStrategy', 'datacenter1': '3' };

On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink clohf...@blackbirdit.com wrote: What replication strategy are you using? If using NetworkTopologyStrategy, double-check that your DC names match up (case sensitive). Chris

On Jul 11, 2014, at 9:38 AM, Ruchir Jha ruchir@gmail.com wrote: Here's the complete stack trace:

com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), attempts=3]UnavailableException()
    at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
    at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
Caused by: UnavailableException()
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
    ... 12 more

On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav ipremya...@gmail.com wrote: Please post the full exception. On Fri, Jul 11, 2014 at 1:50 PM,
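For context on the exception in this thread: with RF=3 and CL_QUORUM on both reads and writes, an UnavailableException means the coordinator saw fewer than a quorum of live replicas for the token range before even attempting the operation. The arithmetic is simply quorum = floor(RF/2) + 1; a sketch (plain Python, not Cassandra code):

```python
def quorum(rf: int) -> int:
    """Number of replicas that must be live to satisfy CL_QUORUM."""
    return rf // 2 + 1

def is_available(rf: int, live_replicas: int) -> bool:
    """Coordinator-side check: when False, the request fails fast
    with UnavailableException rather than timing out."""
    return live_replicas >= quorum(rf)

print(quorum(3))           # 2
print(is_available(3, 1))  # False -> UnavailableException
print(is_available(3, 2))  # True
```

So with RF=3, any token range where two of the three replicas appear down (or are flapping under load) will produce exactly the stack trace above.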
Re: UnavailableException
If you list all 12 nodes in the seeds list, you can try using NodeDiscoveryType.NONE instead of RING_DESCRIBE. It's been recommended that way by some anyway, so that if you add nodes to the cluster your app won't start using them until all the bootstrapping and everything has settled down. Chris

On Jul 14, 2014, at 12:04 PM, Ruchir Jha ruchir@gmail.com wrote: Mark, Here you go: [...]
Re: UnavailableException
Yes, the line is: Datacenter: datacenter1, which matches my CREATE KEYSPACE command. As for the NodeDiscoveryType, we will follow that advice, but I don't believe it to be the root of my issue here, because the nodes start up at least 6 hours before the UnavailableException, and as far as adding nodes is concerned, we would only do that after hours.

On Mon, Jul 14, 2014 at 2:34 PM, Chris Lohfink clohf...@blackbirdit.com wrote: If you list all 12 nodes in the seeds list, you can try using NodeDiscoveryType.NONE instead of RING_DESCRIBE. [...]

On Jul 14, 2014, at 12:04 PM, Ruchir Jha ruchir@gmail.com wrote: Mark, Here you go: [...]
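Chris's case-sensitivity point can be made concrete: NetworkTopologyStrategy matches the DC names in the replication options literally against the snitch-reported datacenter names, with no case folding. A sketch of the kind of sanity check implied (a hypothetical helper, not a driver API):

```python
def check_dc_names(replication_options, live_datacenters):
    """Return replication-option DC names (excluding the 'class' key)
    that do not exactly match any datacenter reported by nodetool
    status. The comparison is case-sensitive, as in Cassandra itself."""
    dcs = {k for k in replication_options if k != "class"}
    return sorted(dcs - set(live_datacenters))

opts = {"class": "NetworkTopologyStrategy", "datacenter1": "3"}
print(check_dc_names(opts, ["datacenter1"]))  # [] -> names line up
print(check_dc_names(opts, ["Datacenter1"]))  # ['datacenter1'] -> every write to this DC is Unavailable
```

A mismatch here means zero replicas can ever be found for the misnamed DC, which surfaces as exactly the UnavailableException discussed in this thread.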
Re: Multi-column range scans
Thanks for both your help, greatly appreciated. We'll proceed down the path of putting the filtering into the application logic for the time being. Matt.

On Tue, Jul 15, 2014 at 1:20 AM, DuyHai Doan doanduy...@gmail.com wrote: Exactly, Ken, I got bitten again by the semantics of composite tuples. This kind of query won't be possible until something like a wide-row end slice predicate is available (https://issues.apache.org/jira/browse/CASSANDRA-6167), if it ever is.

On Mon, Jul 14, 2014 at 5:02 PM, Ken Hancock ken.hanc...@schange.com wrote: I don't think your query is doing what he wants. Your query will correctly set the starting point, but will also return larger interval_ids with lower skill_levels:

cqlsh:test> select * from skill_count where skill='Complaints' and (interval_id, skill_level) >= (140235930, 5);

 skill      | interval_id | skill_level | skill_count
------------+-------------+-------------+-------------
 Complaints |   140235930 |           5 |          20
 Complaints |   140235930 |           8 |          30
 Complaints |   140235930 |          10 |           1
 Complaints |   140235940 |           2 |          10
 Complaints |   140235940 |           8 |          30

(5 rows)

cqlsh:test> select * from skill_count where skill='Complaints' and (interval_id, skill_level) >= (140235930, 5) and (interval_id) < (140235990);

 skill      | interval_id | skill_level | skill_count
------------+-------------+-------------+-------------
 Complaints |   140235930 |           5 |          20   <- desired
 Complaints |   140235930 |           8 |          30   <- desired
 Complaints |   140235930 |          10 |           1   <- desired
 Complaints |   140235940 |           2 |          10   <- SKIP
 Complaints |   140235940 |           8 |          30   <- desired

The query he wants results in a discontinuous range slice, so it isn't supported. Essentially, the client will have to read the entire range and perform client-side filtering. Whether this is efficient depends on the cardinality of skill_level. I tried playing with the ALLOW FILTERING cql clause, but it would appear from the documentation that it's very restrictive...
On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan doanduy...@gmail.com wrote: or:

select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id) < (140235990);

Strangely enough, once you start using the tuple notation you need to stick to it, even if there is only one element in the tuple.

On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan doanduy...@gmail.com wrote: Sorry, I've just checked, the correct query should be:

select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and (interval_id,skill_level) < (140235990,11);

On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com wrote: Hello Matthew. Since Cassandra 2.0.6 it is possible to query over composites: https://issues.apache.org/jira/browse/CASSANDRA-4851. For your example:

select * from skill_count where skill='Complaints' and (interval_id,skill_level) >= (140235930,5) and interval_id < 140235990;

On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen matthew.j.al...@gmail.com wrote: Hi, We have a roll-up table as follows:

CREATE TABLE SKILL_COUNT (
  skill text,
  interval_id bigint,
  skill_level int,
  skill_count int,
  PRIMARY KEY (skill, interval_id, skill_level));

Essentially:
skill = a named skill, i.e. Complaints
interval_id = a rounded epoch time (15 minute intervals)
skill_level = a number/rating from 1-10
skill_count = the number of people with the specified skill, at the specified skill level, logged in at the interval_id

We'd like to run the following query against it:

select * from skill_count where skill='Complaints' and interval_id >= 140235930 and interval_id < 140235990 and skill_level >= 5;

to get a count of people with the relevant skill and level at the appropriate time. However, I am getting the following message:

Bad Request: PRIMARY KEY part skill_level cannot be restricted (preceding part interval_id is either not restricted or by a non-EQ relation)

Looking at how the data is stored ...
---
RowKey: Complaints
=> (name=140235930:2:, value=, timestamp=1405308260403000)
=> (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
=> (name=140235930:5:, value=, timestamp=1405308260403001)
=> (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
=> (name=140235930:8:, value=, timestamp=1405308260419000)
=> (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
=> (name=140235930:10:, value=, timestamp=1405308260419001)
=> (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)

Should cassandra be able to
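Ken's point comes down to how composite tuples compare: (interval_id, skill_level) >= (140235930, 5) is a lexicographic comparison, so any row with a larger interval_id qualifies regardless of its skill_level. A sketch of those semantics plus the client-side filter the thread settles on, using the sample rows from Ken's cqlsh output (plain Python, standing in for the server and the client respectively):

```python
# (interval_id, skill_level, skill_count) rows for skill='Complaints'
rows = [
    (140235930, 5, 20),
    (140235930, 8, 30),
    (140235930, 10, 1),
    (140235940, 2, 10),
    (140235940, 8, 30),
]

# What the tuple predicate actually selects: Python tuple comparison is
# lexicographic, like CQL's, so (140235940, 2) >= (140235930, 5) even
# though 2 < 5 -- the slice is contiguous in clustering order.
server_side = [r for r in rows
               if (r[0], r[1]) >= (140235930, 5) and r[0] < 140235990]

# What the user wants is discontinuous, so skill_level must be
# re-filtered client-side.
wanted = [r for r in server_side if r[1] >= 5]

print(len(server_side))  # 5 -- includes the (140235940, 2) "SKIP" row
print(wanted)            # the four "desired" rows
```

Whether this is acceptable hinges, as Ken says, on the cardinality of skill_level: the client reads the whole interval range and discards at most (low skill levels / all levels) of it.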
Re: high pending compactions
I'm looking into creating monitoring thresholds for Cassandra to report on its health. Does it make sense to set an alert threshold on compaction stats? If so, would setting it to a value equal to or greater than concurrent_compactions make sense? Thanks, Greg

On Mon, Jun 9, 2014 at 2:14 PM, S C as...@outlook.com wrote: Thank you all for quick responses.

From: clohf...@blackbirdit.com Subject: Re: high pending compactions Date: Mon, 9 Jun 2014 14:11:36 -0500 To: user@cassandra.apache.org

Bean: org.apache.cassandra.db.CompactionManager. Also, nodetool compactionstats gives you how many are in the queue plus an estimate of how many will be needed. In 1.1 you will OOM *far* before you hit the limit. In theory though, the compaction executor is a little special-cased and will actually throw an exception (normally it would block). Chris

On Jun 9, 2014, at 7:49 AM, S C as...@outlook.com wrote: Thank you all for valuable suggestions. A couple more questions: how do I check the compaction queue, via an MBean or the C* system log? And what happens if the queue is full?

From: colinkuo...@gmail.com Date: Mon, 9 Jun 2014 18:53:41 +0800 Subject: Re: high pending compactions To: user@cassandra.apache.org

As Jake suggested, you could first increase compaction_throughput_mb_per_sec and concurrent_compactions to suitable values if system resources allow. From my understanding, major compaction internally acquires a lock before running. In your case, there might be a major compaction blocking the pending compaction tasks behind it. You could check the result of nodetool compactionstats and the C* system log to confirm. If the running compaction has been compacting a wide row for a long time, you could try tuning the in_memory_compaction_limit_in_mb value. Thanks,

On Sun, Jun 8, 2014 at 11:27 PM, S C as...@outlook.com wrote: I am using Cassandra 1.1 (sorry, a bit old) and I am seeing a high pending compaction count: pending tasks: 67, while there are no more than 5 active compaction tasks.
I have a 24-CPU machine. Shouldn't I be seeing more compactions? Is this a pattern of high writes and compactions backing up? How can I improve this? Here are my thoughts:

1. Increase memtable_total_space_in_mb
2. Increase compaction_throughput_mb_per_sec
3. Increase concurrent_compactions

Sorry if this was discussed already. Any pointers are much appreciated. Thanks, Kumar
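On Greg's monitoring question: a single pending-compactions sample is noisy (the count legitimately spikes during heavy writes), so one reasonable rule is to alert only when the pending count stays above some multiple of concurrent_compactions for several consecutive samples. A sketch of that threshold logic (the factor and window are illustrative placeholders, not recommendations):

```python
def should_alert(samples, concurrent_compactions, factor=2, sustained=3):
    """Alert when the last `sustained` samples of pending compaction
    tasks (e.g. polled from nodetool compactionstats or the
    CompactionManager MBean) all exceed factor * concurrent_compactions."""
    threshold = factor * concurrent_compactions
    recent = samples[-sustained:]
    return len(recent) == sustained and all(s > threshold for s in recent)

# Sustained backlog: last three samples all well above 2 * 8 = 16.
print(should_alert([3, 5, 70, 67, 64], concurrent_compactions=8))  # True
# A transient spike that clears should not page anyone.
print(should_alert([3, 70, 5, 67, 4], concurrent_compactions=8))   # False
```

The "sustained" window is what distinguishes a genuinely backed-up node (like the pending tasks: 67 in this thread) from a burst that compaction is already catching up on.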
Index creation sometimes fails
Hi everyone, I have some code that I've been fiddling with today that uses the DataStax Java driver to create a table and then create a secondary index on a column in that table. I've tested this code fairly thoroughly on a single-node Cassandra instance on my laptop and in unit tests (using the CassandraDaemon). When running on a three-node cluster, however, I see strange behavior. Although my table always gets created, the secondary index often does not! If I delete the table and then create it again (through the same code that I've written), I've never seen the index fail to appear the second time. Does anyone have any idea what to look for here? I have no experience working on a Cassandra cluster and I wonder if maybe I am doing something dumb (I basically just installed DSE and started up the three nodes and that was it). I don't see anything that looks unusual in OpsCenter for DSE. The only thing I've noticed is that the presence of output like the following from my program, after executing the command to create the index, is perfectly correlated with successful creation of the index:

14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Received event EVENT CREATED kiji_retail2.t_model_repo, scheduling delivery
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection: [Control connection] Refreshing schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Refreshing schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [b309518a-35d2-3790-bb66-ea39bb0d188c]

If anyone can give me a hand, I would really appreciate it. I am out of ideas! Best regards, Clint
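The log output above suggests the index DDL is racing schema propagation: the cluster starts with two schema versions and only converges to one at the end, and the CREATE INDEX is only reliable once every node reports the same version. One possible mitigation is to wait for schema agreement between the two DDL statements; a sketch of that wait loop (the `get_versions` callable is a stand-in for however your client reports per-node schema versions, not a specific driver API):

```python
import time

def wait_for_schema_agreement(get_versions, timeout=10.0, interval=0.2):
    """Poll until every node reports the same schema version.
    `get_versions` returns the set of schema version UUIDs currently
    reported across the cluster; agreement means exactly one version."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if len(get_versions()) == 1:
            return True
        time.sleep(interval)
    return False

# Simulated cluster that converges on the third poll, like the log above.
states = iter([{"9a8d72f9", "b309518a"},
               {"9a8d72f9", "b309518a"},
               {"b309518a"}])
print(wait_for_schema_agreement(lambda: next(states)))  # True
```

Issuing the CREATE INDEX only after this returns True would match the observed correlation between the "Checking for schema agreement" output and the index actually appearing.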
Re: Index creation sometimes fails
BTW I have seen this using versions 2.0.1 and 2.0.3 of the Java driver on a three-node cluster with DSE 4.5.

On Mon, Jul 14, 2014 at 5:51 PM, Clint Kelly clint.ke...@gmail.com wrote: [...]