[RELEASE] Achilles 3.0.4

2014-07-14 Thread DuyHai Doan
Hello all

 We are happy to announce the release of Achilles 3.0.4. Among the biggest
changes:

 - support for static columns: http://goo.gl/o7D5yo
 - dynamic statements logging & tracing at runtime: http://goo.gl/w4jlqZ
 - SchemaBuilder, the mirror of QueryBuilder for creating schema
programmatically: http://goo.gl/DspJQq

 Link to the changelog: http://goo.gl/tKqpFT

  Regards

 Duy Hai DOAN


Re: keyspace with hundreds of columnfamilies

2014-07-14 Thread Ilya Sviridov
Tommaso, looking at your description of the architecture, an idea came to mind.

You could shard on the Cassandra client side and write to different
Cassandra clusters, keeping the number of column families per cluster reasonable.

With best regards,
Ilya
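
A minimal sketch of that client-side sharding idea, written against the
DataStax Java driver; the keyspace name, contact points, and the hash-based
routing are illustrative assumptions, not something prescribed in this thread:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class ShardedWriter {
    private final Session[] shards;

    // One contact point per backing Cassandra cluster.
    public ShardedWriter(String keyspace, String... contactPoints) {
        shards = new Session[contactPoints.length];
        for (int i = 0; i < contactPoints.length; i++) {
            shards[i] = Cluster.builder()
                    .addContactPoint(contactPoints[i])
                    .build()
                    .connect(keyspace);
        }
    }

    // Pin each logical column family to one cluster by hashing its name,
    // so no single cluster has to hold all of the column families.
    private Session shardFor(String columnFamily) {
        return shards[Math.floorMod(columnFamily.hashCode(), shards.length)];
    }

    public void write(String columnFamily, String cql, Object... values) {
        shardFor(columnFamily).execute(cql, values);
    }
}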


On Thu, Jul 3, 2014 at 10:55 PM, tommaso barbugli tbarbu...@gmail.com
wrote:

 thank you for the replies; I am rethinking the schema design. One possible
 solution is to implode one dimension and get N times fewer CFs.
 With this approach I would end up with (CQL) tables with up to 100
 columns; would that be a problem?

 Thank You,
 Tommaso


 2014-07-02 23:43 GMT+02:00 Jack Krupansky j...@basetechnology.com:

   The official answer, engraved in stone tablets, and carried down from
 the mountain: “Although having more than dozens or hundreds of tables
 defined is almost certainly a Bad Idea (just as it is a design smell in a
 relational database), it's relatively straightforward to allow disabling
 the SlabAllocator.” Emphasis on “almost certainly a Bad Idea.”

 See:
 https://issues.apache.org/jira/browse/CASSANDRA-5935
 “Allow disabling slab allocation”

 IOW, this is considered an anti-pattern, but...

 -- Jack Krupansky

  *From:* tommaso barbugli tbarbu...@gmail.com
 *Sent:* Wednesday, July 2, 2014 2:16 PM
 *To:* user@cassandra.apache.org
 *Subject:* Re: keyspace with hundreds of columnfamilies

  Hi,
 thank you for your replies on this; regarding the arena memory, is this a
 fixed memory allocation or some sort of in-memory caching? I ask because
 I think that a substantial portion of the column families created will not
 be queried very frequently (and some will become inactive and stay like
 that for a really long time)

 Thank you,
 Tommaso


 2014-07-02 18:35 GMT+02:00 Romain HARDOUIN romain.hardo...@urssaf.fr:

 Arena allocation is an improvement, not a limitation.
 It was introduced in Cassandra 1.0 in order to lower memory
 fragmentation (and therefore promotion failures).
 AFAIK it's not intended to be tweaked, so it might not be a good idea to
 change it.

 Best,
 Romain

 tommaso barbugli tbarbu...@gmail.com wrote on 02/07/2014 17:40:18:

  From: tommaso barbugli tbarbu...@gmail.com
  To: user@cassandra.apache.org,
  Date: 02/07/2014 17:40
  Subject: Re: keyspace with hundreds of columnfamilies
  
  1MB per column family sounds pretty bad to me; is this something I
  can tweak/workaround somehow?
 
  Thanks
  Tommaso
 

  2014-07-02 17:21 GMT+02:00 Romain HARDOUIN romain.hardo...@urssaf.fr
 :
  The trap is that each CF will consume 1 MB of memory due to arena
 allocation.
  This might seem harmless, but if you plan on thousands of CFs it means
  thousands of megabytes...
  Up to 1,000 CFs I think it could be doable, but not 10,000.
 
  Best,
 
  Romain
 
 
  tommaso barbugli tbarbu...@gmail.com wrote on 02/07/2014
 10:13:41:
 
   From: tommaso barbugli tbarbu...@gmail.com
   To: user@cassandra.apache.org,
   Date: 02/07/2014 10:14
   Subject: keyspace with hundreds of columnfamilies
  
   Hi,
   Are there any known issues or shortcomings with organising data in
   hundreds of column families?
   At present I am running with 300 column families, but I expect
   that to grow to a couple of thousand.
   Is this discouraged / unsupported? (I am using Cassandra
 2.0)
  
   Thanks
   Tommaso







Re: Multi-column range scans

2014-07-14 Thread DuyHai Doan
Sorry, I've just checked, the correct query should be:

select * from skill_count where skill='Complaints' and
(interval_id,skill_level) >= (140235930,5) and
(interval_id,skill_level) < (140235990,11)


On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com wrote:

 Hello Mathew

  Since Cassandra 2.0.6 it is possible to query over composites:
 https://issues.apache.org/jira/browse/CASSANDRA-4851

 For your example:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and interval_id <
 140235990;


 On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen matthew.j.al...@gmail.com
 wrote:

 Hi,

 We have a roll-up table as follows.

 CREATE TABLE SKILL_COUNT (
   skill text,
   interval_id bigint,
   skill_level int,
   skill_count int,
   PRIMARY KEY (skill, interval_id, skill_level));

 Essentially,
   skill = a named skill, e.g. Complaints
   interval_id = a rounded epoch time (15 minute intervals)
   skill_level = a number/rating from 1-10
   skill_count = the number of people with the specified skill, with the
 specified skill level, logged in at the interval_id

 We'd like to run the following query against it

 select * from skill_count where skill='Complaints' and interval_id >=
 140235930 and interval_id < 140235990 and skill_level >= 5;

 to get a count of people with the relevant skill and level at the
 appropriate time.  However I am getting the following message.

 Bad Request: PRIMARY KEY part skill_level cannot be restricted (preceding
 part interval_id is either not restricted or by a non-EQ relation)

 Looking at how the data is stored ...

 ---
 RowKey: Complaints
 => (name=140235930:2:, value=, timestamp=1405308260403000)
 => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
 => (name=140235930:5:, value=, timestamp=1405308260403001)
 => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
 => (name=140235930:8:, value=, timestamp=1405308260419000)
 => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
 => (name=140235930:10:, value=, timestamp=1405308260419001)
 => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)

 Should Cassandra be able to allow for an extra level of filtering, or is
 this something that should be performed within the application?

 We have a solution working in Oracle, but would like to store this data
 in Cassandra, as all the other data that this solution relies on already
 sits within Cassandra.

 Appreciate any guidance on this matter.

 Matt





Re: Multi-column range scans

2014-07-14 Thread DuyHai Doan
or :

select * from skill_count where skill='Complaints'
and (interval_id,skill_level) >= (140235930,5)
and (interval_id) < (140235990)

Strangely enough, once you start using tuple notation you'll need to stick to
it even if there is only one element in the tuple


On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan doanduy...@gmail.com wrote:

 Sorry, I've just checked, the correct query should be:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and
 (interval_id,skill_level) < (140235990,11)


 On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com wrote:

 Hello Mathew

  Since Cassandra 2.0.6 it is possible to query over composites:
 https://issues.apache.org/jira/browse/CASSANDRA-4851

 For your example:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and interval_id <
 140235990;


 On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen matthew.j.al...@gmail.com
  wrote:

 Hi,

 We have a roll-up table as follows.

 CREATE TABLE SKILL_COUNT (
   skill text,
   interval_id bigint,
   skill_level int,
   skill_count int,
   PRIMARY KEY (skill, interval_id, skill_level));

 Essentially,
   skill = a named skill, e.g. Complaints
   interval_id = a rounded epoch time (15 minute intervals)
   skill_level = a number/rating from 1-10
   skill_count = the number of people with the specified skill, with the
 specified skill level, logged in at the interval_id

 We'd like to run the following query against it

 select * from skill_count where skill='Complaints' and interval_id >=
 140235930 and interval_id < 140235990 and skill_level >= 5;

 to get a count of people with the relevant skill and level at the
 appropriate time.  However I am getting the following message.

 Bad Request: PRIMARY KEY part skill_level cannot be restricted
 (preceding part interval_id is either not restricted or by a non-EQ
 relation)

 Looking at how the data is stored ...

 ---
 RowKey: Complaints
 => (name=140235930:2:, value=, timestamp=1405308260403000)
 => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
 => (name=140235930:5:, value=, timestamp=1405308260403001)
 => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
 => (name=140235930:8:, value=, timestamp=1405308260419000)
 => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
 => (name=140235930:10:, value=, timestamp=1405308260419001)
 => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)

 Should Cassandra be able to allow for an extra level of filtering, or
 is this something that should be performed within the application?

 We have a solution working in Oracle, but would like to store this data
 in Cassandra, as all the other data that this solution relies on already
 sits within Cassandra.

 Appreciate any guidance on this matter.

 Matt






Re: Multi-column range scans

2014-07-14 Thread Ken Hancock
I don't think your query is doing what he wants.  Your query will correctly
set the starting point, but will also return larger interval_ids with
lower skill_levels:

cqlsh:test> select * from skill_count where skill='Complaints' and
(interval_id, skill_level) >= (140235930, 5);

 skill  | interval_id   | skill_level | skill_count
------------+---------------+-------------+-------------
 Complaints | 140235930 |   5 |  20
 Complaints | 140235930 |   8 |  30
 Complaints | 140235930 |  10 |   1
 Complaints | 140235940 |   2 |  10
 Complaints | 140235940 |   8 |  30

(5 rows)

cqlsh:test> select * from skill_count where skill='Complaints' and
(interval_id, skill_level) >= (140235930, 5) and (interval_id) <
(140235990);

 skill  | interval_id   | skill_level | skill_count
------------+---------------+-------------+-------------
 Complaints | 140235930 |   5 |  20  <- desired
 Complaints | 140235930 |   8 |  30  <- desired
 Complaints | 140235930 |  10 |   1  <- desired
 Complaints | 140235940 |   2 |  10  <- SKIP
 Complaints | 140235940 |   8 |  30  <- desired

The query results in a discontinuous range slice, so it isn't supported --
essentially, the client will have to read the entire range and perform
client-side filtering.  Whether this is efficient depends on the
cardinality of skill_level.

I tried playing with the ALLOW FILTERING CQL clause, but it would appear
from the documentation that it's very restrictive...
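
For reference, a rough sketch of that client-side filtering against the
skill_count table from this thread (DataStax Java driver; the method and
parameter names are illustrative): fetch the contiguous interval_id slice
that Cassandra does support, then drop rows whose skill_level is below the
cutoff.

import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.ArrayList;
import java.util.List;

public static List<Row> skillCounts(Session session, String skill,
                                    long fromInterval, long toInterval,
                                    int minLevel) {
    List<Row> result = new ArrayList<Row>();
    // Contiguous slice on interval_id only; skill_level is unrestricted here.
    for (Row row : session.execute(
            "SELECT * FROM skill_count WHERE skill = ? "
            + "AND interval_id >= ? AND interval_id < ?",
            skill, fromInterval, toInterval)) {
        if (row.getInt("skill_level") >= minLevel) {  // filter client-side
            result.add(row);
        }
    }
    return result;
}

How much this over-reads depends, as noted, on the cardinality of
skill_level; with levels 1-10 the extra rows per interval are bounded, so
the filter is usually cheap.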





On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan doanduy...@gmail.com wrote:

 or :


 select * from skill_count where skill='Complaints'
 and (interval_id,skill_level) >= (140235930,5)
 and (interval_id) < (140235990)

 Strangely enough, once you start using tuple notation you'll need to stick to
 it even if there is only one element in the tuple


 On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan doanduy...@gmail.com wrote:

 Sorry, I've just checked, the correct query should be:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and
 (interval_id,skill_level) < (140235990,11)


 On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com
 wrote:

 Hello Mathew

  Since Cassandra 2.0.6 it is possible to query over composites:
 https://issues.apache.org/jira/browse/CASSANDRA-4851

 For your example:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and interval_id <
 140235990;


 On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen 
 matthew.j.al...@gmail.com wrote:

 Hi,

 We have a roll-up table as follows.

 CREATE TABLE SKILL_COUNT (
   skill text,
   interval_id bigint,
   skill_level int,
   skill_count int,
   PRIMARY KEY (skill, interval_id, skill_level));

 Essentially,
   skill = a named skill, e.g. Complaints
   interval_id = a rounded epoch time (15 minute intervals)
   skill_level = a number/rating from 1-10
   skill_count = the number of people with the specified skill, with the
 specified skill level, logged in at the interval_id

 We'd like to run the following query against it

 select * from skill_count where skill='Complaints' and interval_id >=
 140235930 and interval_id < 140235990 and skill_level >= 5;

 to get a count of people with the relevant skill and level at the
 appropriate time.  However I am getting the following message.

 Bad Request: PRIMARY KEY part skill_level cannot be restricted
 (preceding part interval_id is either not restricted or by a non-EQ
 relation)

 Looking at how the data is stored ...

 ---
 RowKey: Complaints
 => (name=140235930:2:, value=, timestamp=1405308260403000)
 => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
 => (name=140235930:5:, value=, timestamp=1405308260403001)
 => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
 => (name=140235930:8:, value=, timestamp=1405308260419000)
 => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
 => (name=140235930:10:, value=, timestamp=1405308260419001)
 => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)

 Should Cassandra be able to allow for an extra level of filtering, or
 is this something that should be performed within the application?

 We have a solution working in Oracle, but would like to store this data
 in Cassandra, as all the other data that this solution relies on already
 sits within Cassandra.

 Appreciate any guidance on this matter.

 Matt







-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
http://www.schange.com/en-US/Company/InvestorRelations.aspx
Office: +1 (978) 

Re: Multi-column range scans

2014-07-14 Thread DuyHai Doan
Exactly, Ken. I got bitten again by the semantics of composite tuples.

 This kind of query won't be possible until something like a wide-row end
slice predicate is available (
https://issues.apache.org/jira/browse/CASSANDRA-6167), if it ever is.




On Mon, Jul 14, 2014 at 5:02 PM, Ken Hancock ken.hanc...@schange.com
wrote:

 I don't think your query is doing what he wants.  Your query will
 correctly set the starting point, but will also return larger interval_ids
 with lower skill_levels:

 cqlsh:test> select * from skill_count where skill='Complaints' and
 (interval_id, skill_level) >= (140235930, 5);

  skill  | interval_id   | skill_level | skill_count
 ------------+---------------+-------------+-------------
  Complaints | 140235930 |   5 |  20
  Complaints | 140235930 |   8 |  30
  Complaints | 140235930 |  10 |   1
  Complaints | 140235940 |   2 |  10
  Complaints | 140235940 |   8 |  30

 (5 rows)

 cqlsh:test> select * from skill_count where skill='Complaints' and
 (interval_id, skill_level) >= (140235930, 5) and (interval_id) <
 (140235990);

  skill  | interval_id   | skill_level | skill_count
 ------------+---------------+-------------+-------------
  Complaints | 140235930 |   5 |  20  <- desired
  Complaints | 140235930 |   8 |  30  <- desired
  Complaints | 140235930 |  10 |   1  <- desired
  Complaints | 140235940 |   2 |  10  <- SKIP
  Complaints | 140235940 |   8 |  30  <- desired

 The query results in a discontinuous range slice, so it isn't supported --
 essentially, the client will have to read the entire range and perform
 client-side filtering.  Whether this is efficient depends on the
 cardinality of skill_level.

 I tried playing with the ALLOW FILTERING CQL clause, but it would appear
 from the documentation that it's very restrictive...





 On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan doanduy...@gmail.com wrote:

 or :


 select * from skill_count where skill='Complaints'
 and (interval_id,skill_level) >= (140235930,5)
 and (interval_id) < (140235990)

 Strangely enough, once you start using tuple notation you'll need to stick
 to it even if there is only one element in the tuple


 On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan doanduy...@gmail.com
 wrote:

 Sorry, I've just checked, the correct query should be:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and
 (interval_id,skill_level) < (140235990,11)


 On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com
 wrote:

 Hello Mathew

  Since Cassandra 2.0.6 it is possible to query over composites:
 https://issues.apache.org/jira/browse/CASSANDRA-4851

 For your example:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and interval_id <
 140235990;


 On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen 
 matthew.j.al...@gmail.com wrote:

 Hi,

 We have a roll-up table as follows.

 CREATE TABLE SKILL_COUNT (
   skill text,
   interval_id bigint,
   skill_level int,
   skill_count int,
   PRIMARY KEY (skill, interval_id, skill_level));

 Essentially,
   skill = a named skill, e.g. Complaints
   interval_id = a rounded epoch time (15 minute intervals)
   skill_level = a number/rating from 1-10
   skill_count = the number of people with the specified skill, with
 the specified skill level, logged in at the interval_id

 We'd like to run the following query against it

 select * from skill_count where skill='Complaints' and interval_id >=
 140235930 and interval_id < 140235990 and skill_level >= 5;

 to get a count of people with the relevant skill and level at the
 appropriate time.  However I am getting the following message.

 Bad Request: PRIMARY KEY part skill_level cannot be restricted
 (preceding part interval_id is either not restricted or by a non-EQ
 relation)

 Looking at how the data is stored ...

 ---
 RowKey: Complaints
 => (name=140235930:2:, value=, timestamp=1405308260403000)
 => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
 => (name=140235930:5:, value=, timestamp=1405308260403001)
 => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
 => (name=140235930:8:, value=, timestamp=1405308260419000)
 => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
 => (name=140235930:10:, value=, timestamp=1405308260419001)
 => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)

 Should Cassandra be able to allow for an extra level of filtering, or
 is this something that should be performed within the application?

 We have a solution working in Oracle, but would like to store this
 data in Cassandra, as all the other data that this 

Upgrading from 1.1.9 to 1.2.18

2014-07-14 Thread Denning, Michael
Hello All,

I'm trying to upgrade from a 3 node 1.1.9 cluster to a 6 node 1.2.18 cluster on
ubuntu. Can sstableloader be used to stream from the existing cluster to the
new cluster? If so, what is the suggested method? I keep getting the
following when trying this:

partitioner org.apache.cassandra.dht.RandomPartitioner does not match system 
partitioner org.apache.cassandra.dht.Murmur3Partitioner. Note that the default 
partitioner starting with Cassandra 1.2 is Murmur3Partitioner, so you will need 
to edit that to match your old partitioner if upgrading.

It would appear that 1.1.9 doesn't have Murmur3Partitioner though, so I changed 
the partitioner on the new cluster to RandomPartitioner. Even with that, I get 
the following error:

CLASSPATH=/etc/cassandra/conf/cassandra.yaml:/root/lib_cass15/apache-cassandra-1.2.18.jar:/root/lib_cass15/guava-13.0.1.jar:/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.1.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.1.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.1.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.7.0.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/stress.jar
 Could not retrieve endpoint ranges: java.lang.RuntimeException: Could not retrieve endpoint ranges:
 at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:233)
 at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:119)
 at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:67)
 Caused by: org.apache.thrift.transport.TTransportException
 at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
 at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_ring(Cassandra.java:1155)
 at org.apache.cassandra.thrift.Cassandra$Client.describe_ring(Cassandra.java:1142)
 at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:212)
 ... 2 more

Is there a way to get sstableloader to work? If not, can someone point me to 
documentation explaining other ways to migrate the data/keyspaces? I haven't 
been able to find any detailed docs.

Thank you





Re: UnavailableException

2014-07-14 Thread Ruchir Jha
Mark,

Here you go:

*NodeTool status:*

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load     Tokens  Owns  Host ID                               Rack
UN  10.10.20.15  1.62 TB  256     8.1%  01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
UN  10.10.20.19  1.66 TB  256     8.3%  30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
UN  10.10.20.35  1.62 TB  256     9.0%  17cb8772-2444-46ff-8525-33746514727d  rack1
UN  10.10.20.31  1.64 TB  256     8.3%  1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
UN  10.10.20.52  1.59 TB  256     9.1%  6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
UN  10.10.20.27  1.66 TB  256     7.7%  76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
UN  10.10.20.22  1.66 TB  256     8.9%  46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
UN  10.10.20.39  1.68 TB  256     8.0%  b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
UN  10.10.20.45  1.49 TB  256     7.7%  8d6bce33-8179-4660-8443-2cf822074ca4  rack1
UN  10.10.20.47  1.64 TB  256     7.9%  bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
UN  10.10.20.62  1.59 TB  256     8.2%  84b47313-da75-4519-94f3-3951d554a3e5  rack1
UN  10.10.20.51  1.66 TB  256     8.9%  0343cd58-3686-465f-8280-56fb72d161e2  rack1


*Astyanax Connection Settings:*

seeds   :12
maxConns   :16
maxConnsPerHost:16
connectTimeout :2000
socketTimeout  :6
maxTimeoutCount:16
maxBlockedThreadsPerHost:16
maxOperationsPerConnection:16
DiscoveryType: RING_DESCRIBE
ConnectionPoolType: TOKEN_AWARE
DefaultReadConsistencyLevel: CL_QUORUM
DefaultWriteConsistencyLevel: CL_QUORUM



On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy mark.re...@boxever.com wrote:

 Can you post the output of nodetool status and your Astyanax connection
 settings?


 On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha ruchir@gmail.com wrote:

 This is how we create our keyspace. We just ran this command once through
 a cqlsh session on one of the nodes, so I don't quite understand what you
 mean by "check that your DC names match up"

 CREATE KEYSPACE prod WITH replication = {
   'class': 'NetworkTopologyStrategy',
   'datacenter1': '3'
 };



 On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink clohf...@blackbirdit.com
 wrote:

 What replication strategy are you using? If using NetworkTopologyStrategy,
 double check that your DC names match up (case sensitive)

 Chris

 On Jul 11, 2014, at 9:38 AM, Ruchir Jha ruchir@gmail.com wrote:

 Here's the complete stack trace:

 com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException:
 TokenRangeOfflineException:
 [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874),
 attempts=3]UnavailableException()
 at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
 at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
 at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
 at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
 at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
 at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
 Caused by: UnavailableException()
 at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
 at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
 at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
 at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
 at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
 ... 12 more



 On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav ipremya...@gmail.com
 wrote:

 Please post the full exception.


 On Fri, Jul 11, 2014 at 1:50 PM, Ruchir Jha ruchir@gmail.com
 wrote:

 We have a 12 node cluster and we are consistently seeing this
 exception being thrown during peak write traffic. We have a replication
 factor of 3 and a write consistency level of QUORUM. Also note there is no
 unusual GC or Full GC activity during this time. Appreciate any help.

 Sent from my iPhone









Re: Cassandra use cases/Strengths/Weakness

2014-07-14 Thread Keith Freeman
We've struggled getting consistent write latency and linear write
scalability with a pretty heavy insert load (1000s of records/second),
and our records are about 1k-2k of data (a mix of integer/string columns
and a blob).  Wondering if you have any rough numbers from your small to
medium write sizes experience?


On 07/04/2014 01:58 PM, James Horey wrote:

...
* Low write latency with respect to small to medium write sizes (logs, 
sensor data, etc.)

* Linear write scalability
* ...




Re: Upgrading from 1.1.9 to 1.2.18

2014-07-14 Thread Robert Coli
On Mon, Jul 14, 2014 at 9:54 AM, Denning, Michael 
michael.denn...@kavokerrgroup.com wrote:

  I'm trying to upgrade from a 3 node 1.1.9 cluster to a 6 node 1.2.18
 cluster on ubuntu. Can sstableloader be used to stream from the existing
 cluster to the new cluster? If so, what is the suggested method? I keep
 getting the following when trying this:

http://palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

One of the caveats mentioned there is that sstableloader often does not
work between major versions.

If I were you, I would accomplish this task by dividing it in two:

1) Upgrade my 3 node cluster from 1.1.9 to 1.2.18 via rolling
restart/upgradesstables.
2) Expand 3 node cluster to 6 nodes

Is there a reason you are not using this process?

=Rob


RE: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18

2014-07-14 Thread Denning, Michael
The 3 node cluster is in production.  It'd be difficult for me to get sign-off on the
change control to upgrade it.  The 6 node cluster is already stood up (in
AWS).  In an ideal scenario I'd just be able to bring the data over to the new
cluster.


From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Monday, July 14, 2014 1:53 PM
To: user@cassandra.apache.org
Subject: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18

On Mon, Jul 14, 2014 at 9:54 AM, Denning, Michael 
michael.denn...@kavokerrgroup.commailto:michael.denn...@kavokerrgroup.com 
wrote:

I'm trying to upgrade from a 3 node 1.1.9 cluster to a 6 node 1.2.18 cluster on
ubuntu. Can sstableloader be used to stream from the existing cluster to the
new cluster? If so, what is the suggested method? I keep getting the
following when trying this:
http://palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

One of the caveats mentioned there is that sstableloader often does not work 
between major versions.

If I were you, I would accomplish this task by dividing it in two:

1) Upgrade my 3 node cluster from 1.1.9 to 1.2.18 via rolling 
restart/upgradesstables.
2) Expand 3 node cluster to 6 nodes

Is there a reason you are not using this process?

=Rob





Re: COMMERCIAL:Re: Upgrading from 1.1.9 to 1.2.18

2014-07-14 Thread Robert Coli
On Mon, Jul 14, 2014 at 11:12 AM, Denning, Michael 
michael.denn...@kavokerrgroup.com wrote:

  The 3 node cluster is in production.  It'd be difficult for me to get sign-off
 on the change control to upgrade it.  The 6 node cluster is already stood
  up (in AWS).  In an ideal scenario I'd just be able to bring the data over
  to the new cluster.


Ok, use the "copy the sstables" method from the previous link:

1) fork writes so all writes go to both clusters
2) nodetool flush on source cluster
3) copy all sstables to all target nodes, being careful to avoid name
collision (use rolling restart, probably, refresh is unsafe)
4) run cleanup on target nodes (this will have the same effect as doing an
upgradesstables, as a bonus)
5) turn off writes to old cluster/turn on reads to new cluster
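
A bare-bones sketch of step 1, the forked writes. Since the source cluster is
still 1.1 (Thrift), assume each cluster is reachable through some client
exposing an execute-style write; the DataStax-driver Session below is purely
illustrative, and a Thrift client such as Astyanax would play the same role
for the old cluster:

import com.datastax.driver.core.Session;

public class ForkedWriter {
    private final Session oldCluster;  // source of truth until cutover (step 5)
    private final Session newCluster;  // kept in sync while sstables are copied

    public ForkedWriter(Session oldCluster, Session newCluster) {
        this.oldCluster = oldCluster;
        this.newCluster = newCluster;
    }

    // Every application write goes to both clusters.
    public void write(String cql, Object... values) {
        oldCluster.execute(cql, values);
        newCluster.execute(cql, values);
    }
}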

If I were you, I would strongly consider not using vnodes on your new
cluster. Unless you are very confident the cluster will grow above appx 10
nodes in the near future, you are likely to Just Lose from vnodes.

=Rob


Re: UnavailableException

2014-07-14 Thread Chris Lohfink
Is there a line when doing nodetool info/status like: 

Datacenter: datacenter1
=======================

You need to make sure the datacenter name matches the name specified in your
keyspace replication settings.

Chris

On Jul 14, 2014, at 12:04 PM, Ruchir Jha ruchir@gmail.com wrote:

 Mark,
 
 Here you go:
 
 NodeTool status:
 
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
  --  Address      Load     Tokens  Owns  Host ID                               Rack
  UN  10.10.20.15  1.62 TB  256     8.1%  01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
  UN  10.10.20.19  1.66 TB  256     8.3%  30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
  UN  10.10.20.35  1.62 TB  256     9.0%  17cb8772-2444-46ff-8525-33746514727d  rack1
  UN  10.10.20.31  1.64 TB  256     8.3%  1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
  UN  10.10.20.52  1.59 TB  256     9.1%  6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
  UN  10.10.20.27  1.66 TB  256     7.7%  76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
  UN  10.10.20.22  1.66 TB  256     8.9%  46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
  UN  10.10.20.39  1.68 TB  256     8.0%  b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
  UN  10.10.20.45  1.49 TB  256     7.7%  8d6bce33-8179-4660-8443-2cf822074ca4  rack1
  UN  10.10.20.47  1.64 TB  256     7.9%  bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
  UN  10.10.20.62  1.59 TB  256     8.2%  84b47313-da75-4519-94f3-3951d554a3e5  rack1
  UN  10.10.20.51  1.66 TB  256     8.9%  0343cd58-3686-465f-8280-56fb72d161e2  rack1
 
 
 Astyanax Connection Settings:
 
 seeds   :12
 maxConns   :16
 maxConnsPerHost:16
 connectTimeout :2000
 socketTimeout  :6
 maxTimeoutCount:16
 maxBlockedThreadsPerHost:16
 maxOperationsPerConnection:16
 DiscoveryType: RING_DESCRIBE
 ConnectionPoolType: TOKEN_AWARE
 DefaultReadConsistencyLevel: CL_QUORUM
 DefaultWriteConsistencyLevel: CL_QUORUM
 
 
 
 On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy mark.re...@boxever.com wrote:
 Can you post the output of nodetool status and your Astyanax connection 
 settings?
 
 
 On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha ruchir@gmail.com wrote:
  This is how we create our keyspace. We just ran this command once through a
  cqlsh session on one of the nodes, so I don't quite understand what you mean by
  "check that your DC names match up"
 
 CREATE KEYSPACE prod WITH replication = {
   'class': 'NetworkTopologyStrategy',
   'datacenter1': '3'
 };
 
 
 
 On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink clohf...@blackbirdit.com 
 wrote:
  What replication strategy are you using? If using NetworkTopologyStrategy,
  double check that your DC names match up (case sensitive)
 
 Chris
 
 On Jul 11, 2014, at 9:38 AM, Ruchir Jha ruchir@gmail.com wrote:
 
 Here's the complete stack trace:
 
  com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException:
  TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160,
  latency=22784(42874), attempts=3]UnavailableException()
  at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
  at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
  at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
  at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
  Caused by: UnavailableException()
  at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
  at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
  at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
  ... 12 more
 
 
 
 On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav ipremya...@gmail.com wrote:
 Please post the full exception.
 
 
 On Fri, Jul 11, 2014 at 1:50 PM, 

Re: UnavailableException

2014-07-14 Thread Chris Lohfink
If you list all 12 nodes in the seeds list, you can try using
NodeDiscoveryType.NONE instead of RING_DESCRIBE.

It's been recommended that way by some anyway, so if you add nodes to the
cluster your app won't start using them until all bootstrapping has finished
and everything has settled down.

Chris
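
For reference, a sketch of how those settings might look when building the
Astyanax context; the cluster, keyspace, and pool names and the seed list are
placeholders (the real seeds list would name all 12 nodes):

import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;

AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
        .forCluster("prod-cluster")
        .forKeyspace("prod")
        .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                // NONE: talk only to the hosts in the seeds list below,
                // instead of discovering the ring via RING_DESCRIBE.
                .setDiscoveryType(NodeDiscoveryType.NONE)
                .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
        .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("pool")
                .setPort(9160)
                .setMaxConnsPerHost(16)
                .setSeeds("10.10.20.15:9160,10.10.20.19:9160"))
        .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
        .buildKeyspace(ThriftFamilyFactory.getInstance());
context.start();
Keyspace keyspace = context.getClient();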

On Jul 14, 2014, at 12:04 PM, Ruchir Jha ruchir@gmail.com wrote:

 Mark,
 
 Here you go:
 
 NodeTool status:
 
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
  --  Address      Load     Tokens  Owns  Host ID                               Rack
  UN  10.10.20.15  1.62 TB  256     8.1%  01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
  UN  10.10.20.19  1.66 TB  256     8.3%  30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
  UN  10.10.20.35  1.62 TB  256     9.0%  17cb8772-2444-46ff-8525-33746514727d  rack1
  UN  10.10.20.31  1.64 TB  256     8.3%  1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
  UN  10.10.20.52  1.59 TB  256     9.1%  6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
  UN  10.10.20.27  1.66 TB  256     7.7%  76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
  UN  10.10.20.22  1.66 TB  256     8.9%  46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
  UN  10.10.20.39  1.68 TB  256     8.0%  b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
  UN  10.10.20.45  1.49 TB  256     7.7%  8d6bce33-8179-4660-8443-2cf822074ca4  rack1
  UN  10.10.20.47  1.64 TB  256     7.9%  bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
  UN  10.10.20.62  1.59 TB  256     8.2%  84b47313-da75-4519-94f3-3951d554a3e5  rack1
  UN  10.10.20.51  1.66 TB  256     8.9%  0343cd58-3686-465f-8280-56fb72d161e2  rack1
 
 
 Astyanax Connection Settings:
 
 seeds   :12
 maxConns   :16
 maxConnsPerHost:16
 connectTimeout :2000
 socketTimeout  :6
 maxTimeoutCount:16
 maxBlockedThreadsPerHost:16
 maxOperationsPerConnection:16
 DiscoveryType: RING_DESCRIBE
 ConnectionPoolType: TOKEN_AWARE
 DefaultReadConsistencyLevel: CL_QUORUM
 DefaultWriteConsistencyLevel: CL_QUORUM
 
 
 
 On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy mark.re...@boxever.com wrote:
 Can you post the output of nodetool status and your Astyanax connection 
 settings?
 
 
 On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha ruchir@gmail.com wrote:
  This is how we create our keyspace. We just ran this command once through a
  cqlsh session on one of the nodes, so I don't quite understand what you mean by
  "check that your DC names match up"
 
 CREATE KEYSPACE prod WITH replication = {
   'class': 'NetworkTopologyStrategy',
   'datacenter1': '3'
 };
 
 
 
 On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink clohf...@blackbirdit.com 
 wrote:
  What replication strategy are you using? If using NetworkTopologyStrategy,
  double check that your DC names match up (case sensitive)
 
 Chris
 
 On Jul 11, 2014, at 9:38 AM, Ruchir Jha ruchir@gmail.com wrote:
 
 Here's the complete stack trace:
 
  com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException:
  TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160,
  latency=22784(42874), attempts=3]UnavailableException()
  at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
  at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
  at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
  at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
  Caused by: UnavailableException()
  at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
  at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
  at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
  ... 12 more
 
 
 
 On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav ipremya...@gmail.com wrote:
 

Re: UnavailableException

2014-07-14 Thread Ruchir Jha
Yes, the line is: Datacenter: datacenter1, which matches my create
keyspace command. As for the NodeDiscoveryType, we will follow that advice, but I
don't believe it to be the root of my issue here, because the nodes start up
at least 6 hours before the UnavailableException, and as far as adding nodes
is concerned we would only do it after hours.


On Mon, Jul 14, 2014 at 2:34 PM, Chris Lohfink clohf...@blackbirdit.com
wrote:

  If you list all 12 nodes in the seeds list, you can try using
  NodeDiscoveryType.NONE instead of RING_DESCRIBE.

  It's been recommended that way by some anyway, so if you add nodes to the
  cluster your app won't start using them until all bootstrapping has
  finished and everything has settled down.

 Chris

 On Jul 14, 2014, at 12:04 PM, Ruchir Jha ruchir@gmail.com wrote:

 Mark,

 Here you go:

 *NodeTool status:*

 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
  --  Address      Load     Tokens  Owns  Host ID                               Rack
  UN  10.10.20.15  1.62 TB  256     8.1%  01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
  UN  10.10.20.19  1.66 TB  256     8.3%  30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
  UN  10.10.20.35  1.62 TB  256     9.0%  17cb8772-2444-46ff-8525-33746514727d  rack1
  UN  10.10.20.31  1.64 TB  256     8.3%  1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
  UN  10.10.20.52  1.59 TB  256     9.1%  6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
  UN  10.10.20.27  1.66 TB  256     7.7%  76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
  UN  10.10.20.22  1.66 TB  256     8.9%  46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
  UN  10.10.20.39  1.68 TB  256     8.0%  b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
  UN  10.10.20.45  1.49 TB  256     7.7%  8d6bce33-8179-4660-8443-2cf822074ca4  rack1
  UN  10.10.20.47  1.64 TB  256     7.9%  bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
  UN  10.10.20.62  1.59 TB  256     8.2%  84b47313-da75-4519-94f3-3951d554a3e5  rack1
  UN  10.10.20.51  1.66 TB  256     8.9%  0343cd58-3686-465f-8280-56fb72d161e2  rack1


 *Astyanax Connection Settings:*

 seeds   :12
 maxConns   :16
 maxConnsPerHost:16
 connectTimeout :2000
 socketTimeout  :6
 maxTimeoutCount:16
 maxBlockedThreadsPerHost:16
 maxOperationsPerConnection:16
 DiscoveryType: RING_DESCRIBE
 ConnectionPoolType: TOKEN_AWARE
 DefaultReadConsistencyLevel: CL_QUORUM
 DefaultWriteConsistencyLevel: CL_QUORUM



 On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy mark.re...@boxever.com
 wrote:

 Can you post the output of nodetool status and your Astyanax connection
 settings?


 On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha ruchir@gmail.com wrote:

  This is how we create our keyspace. We just ran this command once
  through a cqlsh session on one of the nodes, so I don't quite understand what
  you mean by "check that your DC names match up"

 CREATE KEYSPACE prod WITH replication = {
   'class': 'NetworkTopologyStrategy',
   'datacenter1': '3'
 };



 On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink clohf...@blackbirdit.com
  wrote:

  What replication strategy are you using? If using
  NetworkTopologyStrategy, double check that your DC names match up (case
  sensitive)

 Chris

 On Jul 11, 2014, at 9:38 AM, Ruchir Jha ruchir@gmail.com wrote:

 Here's the complete stack trace:

  com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException:
  TokenRangeOfflineException:
  [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874),
  attempts=3]UnavailableException()
  at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
  at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
  at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
  at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
  at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
  at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
  Caused by: UnavailableException()
  at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
  at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
  at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
  at 

Re: Multi-column range scans

2014-07-14 Thread Matthew Allen
Thanks to you both for your help, greatly appreciated.

We'll proceed down the path of putting the filtering into the application
logic for the time being.

Matt.


On Tue, Jul 15, 2014 at 1:20 AM, DuyHai Doan doanduy...@gmail.com wrote:

 Exactly, Ken. I got bitten again by the semantics of composite tuples.

  This kind of query won't be possible until something like a wide-row end
 slice predicate is available (
 https://issues.apache.org/jira/browse/CASSANDRA-6167), if it ever is.




 On Mon, Jul 14, 2014 at 5:02 PM, Ken Hancock ken.hanc...@schange.com
 wrote:

 I don't think your query is doing what he wants.  Your query will
 correctly set the starting point, but will also return larger interval_ids
 with lower skill_levels:

 cqlsh:test> select * from skill_count where skill='Complaints' and
 (interval_id, skill_level) >= (140235930, 5);

  skill  | interval_id   | skill_level | skill_count
 ------------+---------------+-------------+-------------
  Complaints | 140235930 |   5 |  20
  Complaints | 140235930 |   8 |  30
  Complaints | 140235930 |  10 |   1
  Complaints | 140235940 |   2 |  10
  Complaints | 140235940 |   8 |  30

 (5 rows)

 cqlsh:test> select * from skill_count where skill='Complaints' and
 (interval_id, skill_level) >= (140235930, 5) and (interval_id) <
 (140235990);

  skill  | interval_id   | skill_level | skill_count
 ------------+---------------+-------------+-------------
  Complaints | 140235930 |   5 |  20  <- desired
  Complaints | 140235930 |   8 |  30  <- desired
  Complaints | 140235930 |  10 |   1  <- desired
  Complaints | 140235940 |   2 |  10  <- SKIP
  Complaints | 140235940 |   8 |  30  <- desired

 The query results in a discontinuous range slice, so it isn't supported --
 essentially, the client will have to read the entire range and perform
 client-side filtering.  Whether this is efficient depends on the
 cardinality of skill_level.

 I tried playing with the ALLOW FILTERING CQL clause, but it would
 appear from the documentation that it's very restrictive...





 On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan doanduy...@gmail.com
 wrote:

 or :


 select * from skill_count where skill='Complaints'
 and (interval_id,skill_level) >= (140235930,5)
 and (interval_id) < (140235990)

 Strangely enough, once you start using tuple notation you'll need to stick
 to it even if there is only one element in the tuple


 On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan doanduy...@gmail.com
 wrote:

 Sorry, I've just checked, the correct query should be:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and
 (interval_id,skill_level) < (140235990,11)


 On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan doanduy...@gmail.com
 wrote:

 Hello Mathew

  Since Cassandra 2.0.6 it is possible to query over composites:
 https://issues.apache.org/jira/browse/CASSANDRA-4851

 For your example:

 select * from skill_count where skill='Complaints' and
 (interval_id,skill_level) >= (140235930,5) and interval_id <
 140235990;


 On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen 
 matthew.j.al...@gmail.com wrote:

 Hi,

 We have a roll-up table as follows.

 CREATE TABLE SKILL_COUNT (
   skill text,
   interval_id bigint,
   skill_level int,
   skill_count int,
   PRIMARY KEY (skill, interval_id, skill_level));

 Essentially,
    skill = a named skill, e.g. Complaints
   interval_id = a rounded epoch time (15 minute intervals)
   skill_level = a number/rating from 1-10
   skill_count = the number of people with the specified skill, with
 the specified skill level, logged in at the interval_id

 We'd like to run the following query against it

 select * from skill_count where skill='Complaints' and interval_id >=
 140235930 and interval_id < 140235990 and skill_level >= 5;

 to get a count of people with the relevant skill and level at the
 appropriate time.  However I am getting the following message.

 Bad Request: PRIMARY KEY part skill_level cannot be restricted
 (preceding part interval_id is either not restricted or by a non-EQ
 relation)

 Looking at how the data is stored ...

 ---
 RowKey: Complaints
 => (name=140235930:2:, value=, timestamp=1405308260403000)
 => (name=140235930:2:skill_count, value=000a, timestamp=1405308260403000)
 => (name=140235930:5:, value=, timestamp=1405308260403001)
 => (name=140235930:5:skill_count, value=0014, timestamp=1405308260403001)
 => (name=140235930:8:, value=, timestamp=1405308260419000)
 => (name=140235930:8:skill_count, value=001e, timestamp=1405308260419000)
 => (name=140235930:10:, value=, timestamp=1405308260419001)
 => (name=140235930:10:skill_count, value=0001, timestamp=1405308260419001)

 Should Cassandra be able to 

Re: high pending compactions

2014-07-14 Thread Greg Bone
I'm looking into creating monitoring thresholds for Cassandra to report
on its health. Does it make sense to set an alert threshold on compaction
stats? If so, would setting it to a value equal to or greater than the
number of concurrent compactions make sense?

Thanks,
Greg
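
For the threshold itself, a small sketch of reading the pending-compaction
count over JMX; it assumes the default JMX port (7199) and the
CompactionManager MBean mentioned below in this thread, whose PendingTasks
attribute exposes the queue length:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionBacklogCheck {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName compactionManager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");
            long pending = ((Number) mbs.getAttribute(
                    compactionManager, "PendingTasks")).longValue();
            // Alerting idea from the question above: warn when the backlog
            // stays above (a multiple of) the concurrent compaction count.
            System.out.println("Pending compactions: " + pending);
        } finally {
            jmxc.close();
        }
    }
}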




On Mon, Jun 9, 2014 at 2:14 PM, S C as...@outlook.com wrote:

 Thank you all for quick responses.
 --
 From: clohf...@blackbirdit.com
 Subject: Re: high pending compactions
 Date: Mon, 9 Jun 2014 14:11:36 -0500
 To: user@cassandra.apache.org

 Bean: org.apache.cassandra.db.CompactionManager

 Also, nodetool compactionstats gives you how many are in the queue plus an
 estimate of how many will be needed.

 In 1.1 you will OOM *far* before you hit the limit.  In theory though,
 the compaction executor is a little special-cased and will actually throw
 an exception (normally it will block)

 Chris

 On Jun 9, 2014, at 7:49 AM, S C as...@outlook.com wrote:

 Thank you all for the valuable suggestions. A couple more questions:

 How do I check the compaction queue? MBean / C* system log?
 What happens if the queue is full?

 --
 From: colinkuo...@gmail.com
 Date: Mon, 9 Jun 2014 18:53:41 +0800
 Subject: Re: high pending compactions
 To: user@cassandra.apache.org

 As Jake suggested, you could first increase
 compaction_throughput_mb_per_sec and concurrent_compactions to suitable
 values if system resources allow. From my understanding, major
 compaction will internally acquire a lock before running. In your
 case, there might be a major compaction blocking the following pending
 compaction tasks. You could check the result of nodetool compactionstats
 and the C* system log to double-check.

 If the running compaction is compacting wide row for a long time, you
 could try to tune in_memory_compaction_limit_in_mb value.

 Thanks,



 On Sun, Jun 8, 2014 at 11:27 PM, S C as...@outlook.com wrote:

 I am using Cassandra 1.1 (sorry, a bit old) and I am seeing a high pending
 compaction count (pending tasks: 67) while active compaction tasks are
 not more than 5. I have a 24-CPU machine. Shouldn't I be seeing more
 compactions? Is this a pattern of high writes and compactions backing up?
 How can I improve this? Here are my thoughts:


1. Increase memtable_total_space_in_mb
2. Increase compaction_throughput_mb_per_sec
3. Increase concurrent_compactions


 Sorry if this was discussed already. Any pointers is much appreciated.

 Thanks,
 Kumar





Index creation sometimes fails

2014-07-14 Thread Clint Kelly
Hi everyone,

I have some code that I've been fiddling with today that uses the
DataStax Java driver to create a table and then create a secondary
index on a column in that table.  I've tested this code fairly
thoroughly on a single-node Cassandra instance on my laptop and in
unit tests (using the CassandraDaemon).

When running on a three-node cluster, however, I see strange behavior.
Although my table always gets created, the secondary index often does
not!  If I delete the table and then create it again (through the same
code that I've written), I've never seen the index fail to appear the
second time.

Does anyone have any idea what to look for here?  I have no experience
working on a Cassandra cluster and I wonder if maybe I am doing
something dumb (I basically just installed DSE and started up the
three nodes and that was it).  I don't see anything that looks unusual
in OpsCenter for DSE.

The only thing I've noticed is that the presence of output like the
following from my program after executing the command to create the
index is perfectly correlated with successful creation of the index:

14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Received
event EVENT CREATED kiji_retail2.t_model_repo, scheduling delivery
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection:
[Control connection] Refreshing schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Refreshing
schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade,
b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade,
b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade,
b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[9a8d72f9-e384-3aa8-bc85-185e2c303ade,
b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
Checking for schema agreement: versions are
[b309518a-35d2-3790-bb66-ea39bb0d188c]

If anyone can give me a hand, I would really appreciate it.  I am out of ideas!

Best regards,
Clint
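
Given the correlation Clint observed with the schema-agreement log lines, one
hedged sketch of a workaround: after the CREATE TABLE, poll the schema_version
columns in the system tables until every node reports the same version, and
only then issue the CREATE INDEX. The helper below is a hypothetical utility
written against the DataStax Java driver, not part of the driver's own API:

import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public static boolean waitForSchemaAgreement(Session session, long timeoutMs)
        throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
        Set<UUID> versions = new HashSet<UUID>();
        // The local node's schema version...
        versions.add(session.execute(
                "SELECT schema_version FROM system.local WHERE key = 'local'")
                .one().getUUID("schema_version"));
        // ...and every peer's.
        for (Row peer : session.execute(
                "SELECT schema_version FROM system.peers")) {
            versions.add(peer.getUUID("schema_version"));
        }
        if (versions.size() == 1) {
            return true;  // all nodes agree; safe to CREATE INDEX now
        }
        Thread.sleep(200);
    }
    return false;  // timed out without schema agreement
}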


Re: Index creation sometimes fails

2014-07-14 Thread Clint Kelly
BTW I have seen this using versions 2.0.1 and 2.0.3 of the java driver
on a three-node cluster with DSE 4.5.

On Mon, Jul 14, 2014 at 5:51 PM, Clint Kelly clint.ke...@gmail.com wrote:
 Hi everyone,

 I have some code that I've been fiddling with today that uses the
 DataStax Java driver to create a table and then create a secondary
 index on a column in that table.  I've tested this code fairly
 thoroughly on a single-node Cassandra instance on my laptop and in
 unit tests (using the CassandraDaemon).

 When running on a three-node cluster, however, I see strange behavior.
 Although my table always gets created, the secondary index often does
 not!  If I delete the table and then create it again (through the same
 code that I've written), I've never seen the index fail to appear the
 second time.

 Does anyone have any idea what to look for here?  I have no experience
 working on a Cassandra cluster and I wonder if maybe I am doing
 something dumb (I basically just installed DSE and started up the
 three nodes and that was it).  I don't see anything that looks unusual
 in OpsCenter for DSE.

 The only thing I've noticed is that the presence of output like the
 following from my program after executing the command to create the
 index is perfectly correlated with successful creation of the index:

 14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Received
 event EVENT CREATED kiji_retail2.t_model_repo, scheduling delivery
 14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection:
 [Control connection] Refreshing schema for kiji_retail2
 14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Refreshing
 schema for kiji_retail2
 14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection:
 Checking for schema agreement: versions are
 [9a8d72f9-e384-3aa8-bc85-185e2c303ade,
 b309518a-35d2-3790-bb66-ea39bb0d188c]
 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
 Checking for schema agreement: versions are
 [9a8d72f9-e384-3aa8-bc85-185e2c303ade,
 b309518a-35d2-3790-bb66-ea39bb0d188c]
 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
 Checking for schema agreement: versions are
 [9a8d72f9-e384-3aa8-bc85-185e2c303ade,
 b309518a-35d2-3790-bb66-ea39bb0d188c]
 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
 Checking for schema agreement: versions are
 [9a8d72f9-e384-3aa8-bc85-185e2c303ade,
 b309518a-35d2-3790-bb66-ea39bb0d188c]
 14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection:
 Checking for schema agreement: versions are
 [b309518a-35d2-3790-bb66-ea39bb0d188c]

 If anyone can give me a hand, I would really appreciate it.  I am out of 
 ideas!

 Best regards,
 Clint