Moving SSTable to fix JBOD imbalance

2017-05-12 Thread Axel Colin de Verdiere
Hello !

I'm experiencing a data imbalance issue with one of my nodes within a
3-node C* 2.1.4 cluster. All of them are using JBOD (2 physical disks),
and this particular node seems to have recently made a relatively big
compaction (I'm using STCS), creating a 56 GB SSTable file, which results in
one of the disks being 94% used and the other only 34%. I've looked around
for similar issues, and this was supposed to be fixed in 2.1.3
(CASSANDRA-7386). The DSE docs suggest
stopping the node and moving some SSTables around between the disks to
force a better balance, while trying to make as few moves as possible. Can
I just stop the node, move the 56 GB SSTable (so I guess the Summary, TOC,
Digest, Statistics, CompressionInfo, Data, Index and Filter files) and
restart the node?

Thanks a lot for your help,
Best,

Axel


Re: Ring show high load average when restarting a node.

2016-12-06 Thread Colin Kuo
Hi,

The speculative execution (rapid read protection) feature in Cassandra 2.0 helps in this case.
You can find further explanation at the link below.
http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2
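
For example, in 2.0.2+ rapid read protection is controlled per table through the
speculative_retry option; a rough CQL sketch (keyspace and table names are made up):

    -- Send a redundant read to another replica when the first replica is
    -- slower than the table's 99th percentile read latency:
    ALTER TABLE my_ks.my_table WITH speculative_retry = '99percentile';

    -- Or always read from an extra replica (more load, best worst-case latency):
    -- ALTER TABLE my_ks.my_table WITH speculative_retry = 'ALWAYS';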

Thanks!


On Tue, Dec 6, 2016 at 10:13 AM, Sungju Hong  wrote:

> Hello,
>
> When I restart a node, most other nodes show a high load average and block
> queries for one or two minutes.
> Why are the other nodes affected?
>
> - I have a cluster of 70 nodes.
> - Cassandra version 1.2.3
> - RF: 3
> - disabled hinted handoff
>
> I will appreciate any advice.
>
> Thanks.
> Regards.
>
>
>


Re: Adding nodes to existing cluster

2015-04-20 Thread Colin Clark
unsubscribe


 On Apr 20, 2015, at 8:08 AM, Carlos Rolo r...@pythian.com wrote:
 
 Independent of the snitch, data needs to travel to the new nodes (plus all
 the keyspace information that goes via gossip). So I wouldn't bootstrap them all
 at once, even if only because of the network traffic generated.
 
 Don't forget to run cleanup on the old nodes once all nodes are in place to 
 reclaim disk space.
 
 Regards,
 
 Carlos Juzarte Rolo
 Cassandra Consultant
  
 Pythian - Love your data
 
 rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
 Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
 www.pythian.com
 On Mon, Apr 20, 2015 at 1:58 PM, Or Sher or.sh...@gmail.com wrote:
 Thanks for the response.
 Sure we'll monitor as we're adding nodes.
 We're now using 6 nodes on each DC. (We have 2 DCs)
 Each node contains ~800GB
 
 Do you know how rack configurations are relevant here?
 Do you see any reason to bootstrap them one by one if we're not using
 rack awareness?
 
 
 On Mon, Apr 20, 2015 at 2:49 PM, Carlos Rolo r...@pythian.com wrote:
  Start one node at a time. Wait 2 minutes before starting each node.
 
 
  How much data and nodes you have already? Depending on that, the streaming
  of data can stress on the resources you have.
  I would recommend to start one and monitor, if things are ok, add another
  one. And so on.
 
  Regards,
 
  Carlos Juzarte Rolo
  Cassandra Consultant
 
  Pythian - Love your data
 
  rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
  Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
  www.pythian.com
 
  On Mon, Apr 20, 2015 at 11:02 AM, Or Sher or.sh...@gmail.com wrote:
 
  Hi all,
  In the near future I'll need to add more than 10 nodes to a 2.0.9
  cluster (using vnodes).
  I read this documentation on datastax website:
 
   http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
 
  In one point it says:
  If you are using racks, you can safely bootstrap two nodes at a time
  when both nodes are on the same rack.
 
  And in another is says:
  Start Cassandra on each new node. Allow two minutes between node
  initializations. You can monitor the startup and data streaming
  process using nodetool netstats.
 
   We're not using a rack configuration, and from reading this
   documentation I'm not really sure whether it is safe for us to bootstrap all
   nodes together (with two minutes between each one).
   I really hate the thought of doing it one by one; I assume it will take
   more than 6h per node.
 
  What do you say?
  --
  Or Sher
 
 
 
 
 --
 Or Sher
 
 





Re: Cassandra 1.2.9 will not start

2015-04-18 Thread Colin Clark
unsubscribe
 On Apr 18, 2015, at 4:26 PM, Bill Miller bmil...@inthinc.com wrote:
 
 I tried restarting two nodes that were working and now I get this.  
 
 
 
 INFO 15:13:50,296 Initializing system.range_xfers
  INFO 15:13:50,300 Initializing system.schema_keyspaces
  INFO 15:13:50,301 Opening 
 /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749926 
 (597 bytes)
  INFO 15:13:50,302 Opening 
 /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749927 
 (516 bytes)
  INFO 15:13:50,302 Opening 
 /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749926 
 (597 bytes)
  INFO 15:13:50,302 Opening 
 /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749927 
 (516 bytes)
 java.lang.AssertionError
   at org.apache.cassandra.cql3.CFDefinition.<init>(CFDefinition.java:162)
   at 
 org.apache.cassandra.config.CFMetaData.updateCfDef(CFMetaData.java:1526)
   at 
 org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1441)
   at 
 org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:306)
   at 
 org.apache.cassandra.config.KSMetaData.fromSchema(KSMetaData.java:287)
   at org.apache.cassandra.db.DefsTable.loadFromTable(DefsTable.java:154)
   at 
 org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:563)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:254)
   at 
 org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:381)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:497)
   at 
 org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:212)
 Cannot load daemon
 Service exit with a return value of 3
 



Re: What's to think of when increasing disk size on Cassandra nodes?

2015-04-08 Thread Colin
Yikes, 18 TB/node is a very bad idea.

I don't like to go over 2-3 TB personally, and you have to be careful with JBOD.
See one of Ellis's latest posts on this and the suggested use of LVM.  It is a
reversal of the previous position on JBOD.

--
Colin 
+1 612 859 6129
Skype colin.p.clark

 On Apr 8, 2015, at 3:11 PM, Jack Krupansky jack.krupan...@gmail.com wrote:
 
 I can certainly sympathize if you have IT staff/management who will willingly 
 spring for some disk drives, but not for full machines, even if they are 
 relatively commodity boxes. Seems penny-wise and pound-foolish to me, but 
 management has their own priorities, plus there is the pre-existing Oracle 
 mindset of dense/fat nodes as a preference.
 
 -- Jack Krupansky
 
 On Wed, Apr 8, 2015 at 2:00 PM, Nate McCall n...@thelastpickle.com wrote:
 First off, I agree that the preferred path is adding nodes, but it is 
 possible. 
 
  Can C* handle up to 18 TB data size per node with this amount of RAM?
 
 Depends on how deep in the weeds you want to get tuning and testing. See 
 below. 
 
 
  Is it feasible to increase the disk size by mounting a new (larger) disk, 
  copy all SS tables to it, and then mount it on the same mount point as the 
  original (smaller) disk (to replace it)? 
 
 Yes (with C* off of course). 
 
 As for tuning, you will need to look at, experiment with, and get a good 
 understanding of:
  - index_interval (turn this up now anyway if you have not already ~ start at 512
  and go up from there)
 - bloom filter space usage via bloom_filter_fp_chance 
 - compression metadata storage via chunk_length_kb 
  - repair time, and how compaction_throughput_in_mb_per_sec and
  stream_throughput_outbound_megabits_per_sec will affect it
 
 The first three will have a direct negative impact on read performance.
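  
  As a rough illustration of the second and third knobs (keyspace and table names
  are made up; index_interval lives in cassandra.yaml or as a table property
  depending on the exact version):
  
      -- Trade some read performance for less bloom filter memory and less
      -- compression metadata held per SSTable:
      ALTER TABLE my_ks.big_table
        WITH bloom_filter_fp_chance = 0.1
        AND compression = { 'sstable_compression' : 'LZ4Compressor',
                            'chunk_length_kb' : 64 };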
 
  You will definitely want to use JBOD so you don't have to repair everything
  if you lose a single disk, but you will still be degraded for *a very long
  time* when you lose a disk.
 
  This is hard and takes experimentation and research (I can't emphasize this
  part enough), but I've seen it work. That said, the engineering time spent
  is probably more than buying and deploying additional hardware in the first
  place. YMMV.
 
 
 --
 -
 Nate McCall
 Austin, TX
 @zznate
 
 Co-Founder  Sr. Technical Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com
 


Re: Replication to second data center with different number of nodes

2015-03-28 Thread Colin Clark
I typically use a number a lot lower than 256 for num_tokens, usually less than 20,
as a larger number has historically had a dramatic impact on query performance.
—
Colin Clark
co...@clark.ws
+1 612-859-6129
skype colin.p.clark

 On Mar 28, 2015, at 3:46 PM, Eric Stevens migh...@gmail.com wrote:
 
  If you're curious about how Cassandra knows how to replicate data in the
  remote DC, it's the same as in the local DC: replication is independent in
  each, and you can even set a different replication strategy per keyspace per
  datacenter.  Nodes in each DC take up num_tokens positions on a ring, each
  partition key is mapped to a position on that ring, and whoever owns that
  part of the ring is the primary for that data.  Then (oversimplified) RF-1
  adjacent nodes become replicas for that same data.
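  
  Concretely, the per-DC replica counts are just part of the keyspace definition;
  a rough sketch with made-up keyspace and data center names:
  
      ALTER KEYSPACE my_ks WITH replication = {
          'class' : 'NetworkTopologyStrategy',
          'DC1'   : 3,
          'DC2'   : 3 };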
 
  On Fri, Mar 27, 2015 at 6:55 AM, Sibbald, Charles charles.sibb...@bskyb.com wrote:
  http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens
 
 So go with a default 256, and leave initial token empty:
 
 num_tokens: 256
 # initial_token:
 
 Cassandra will always give each node the same number of tokens, the only time 
 you might want to distribute this is if your instances are of different 
 sizing/capability which is also a bad scenario.
 
  From: Björn Hachmann bjoern.hachm...@metrigo.de
  Reply-To: user@cassandra.apache.org
  Date: Friday, 27 March 2015 12:11
  To: user user@cassandra.apache.org
  Subject: Re: Replication to second data center with different number of nodes
 
 
  2015-03-27 11:58 GMT+01:00 Sibbald, Charles charles.sibb...@bskyb.com:
  Cassandra’s Vnodes config
  
  Thank you. Yes, we are using vnodes! The num_tokens parameter controls the
  number of vnodes assigned to a specific node.
  
  It might be that I am seeing problems where there are none.
 
 Let me rephrase my question: How does Cassandra know it has to replicate 1/3 
 of all keys to each single node in the second DC? I can see two ways:
  1. It has to be configured explicitly.
  2. It is derived from the number of nodes available in the data center at 
 the time `nodetool rebuild` is started.
 
 Kind regards
 Björn
 





Re: How to speed up SELECT * query in Cassandra

2015-02-11 Thread Colin
Did you want me to include specific examples from my employment at DataStax or
start from the ground up?

All Spark on Cassandra is, is a better alternative to the previous use of Hive.

The fact that DataStax hasn't provided any benchmarks themselves other than
glossy marketing statements pretty much says it all - where are your benchmarks?
Maybe you could combine it with the in-memory option to really boogie...

:)

(If I find time, I might just write a blog post about exactly how to do this - it
involves the use of parquet and partitioning with clustering - and it doesn't cost
anything to do it - and it's in production at my company)
--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 11, 2015, at 6:51 AM, DuyHai Doan doanduy...@gmail.com wrote:
 
  The very nature of cassandra's distributed nature vs partitioning data on
  hadoop makes spark on hdfs actually faster than on cassandra
  
  Prove it. Did you ever have a look into the source code of the
  Spark/Cassandra connector to see how data locality is achieved before
  throwing out such a statement?
 
 On Wed, Feb 11, 2015 at 12:42 PM, Marcelo Valle (BLOOMBERG/ LONDON) 
 mvallemil...@bloomberg.net wrote:
   cassandra makes a very poor datawarehouse or long term time series store
  
  Really? This is not the impression I have... I think Cassandra is good for
  storing large amounts of data and historical information; it's only not good
  for storing temporary data.
  Netflix has a large amount of data and it's all stored in Cassandra, AFAIK.
 
   The very nature of cassandra's distributed nature vs partitioning data on
   hadoop makes spark on hdfs actually faster than on cassandra.
 
  I am not sure about the current state of Spark support for Cassandra, but I
  guess if you create a map reduce job, the intermediate map results will
  still be stored in HDFS, as happens with hadoop, is this right? I think the
  problem with Spark + Cassandra or with Hadoop + Cassandra is that the hard
  part spark or hadoop does, the shuffling, could be done out of the box with
  Cassandra, but no one takes advantage of that. What if a map / reduce job
  used a temporary CF in Cassandra to store intermediate results?
 
 From: user@cassandra.apache.org 
 Subject: Re: How to speed up SELECT * query in Cassandra
 I use spark with cassandra, and you dont need DSE.
 
  I see a lot of people ask this same question below (how do I get a lot of
  data out of cassandra?), and my question is always, why aren't you updating
  both places at once?
  
  For example, we use hadoop and cassandra in conjunction with each other, we
  use a message bus to store every event in both, aggregate in both, but only
  keep current data in cassandra (cassandra makes a very poor datawarehouse or
  long term time series store) and then use services to process queries that
  merge data from hadoop and cassandra.
  
  Also, spark on hdfs gives more flexibility in terms of large datasets and
  performance.  The very nature of cassandra's distributed nature vs
  partitioning data on hadoop makes spark on hdfs actually faster than on
  cassandra
 
 
 
 --
 Colin Clark 
 +1 612 859 6129
 Skype colin.p.clark
 
 On Feb 11, 2015, at 4:49 AM, Jens Rantil jens.ran...@tink.se wrote:
 
 
 On Wed, Feb 11, 2015 at 11:40 AM, Marcelo Valle (BLOOMBERG/ LONDON) 
 mvallemil...@bloomberg.net wrote:
 If you use Cassandra enterprise, you can use hive, AFAIK.
 
 Even better, you can use Spark/Shark with DSE.
 
 Cheers,
 Jens
 
 
 -- 
 Jens Rantil
 Backend engineer
 Tink AB
 
 Email: jens.ran...@tink.se
 Phone: +46 708 84 18 32
 Web: www.tink.se
 
 Facebook Linkedin Twitter
 
 


Re: How to speed up SELECT * query in Cassandra

2015-02-11 Thread Colin
No, the question isn't closed.  You don't get to decide that.

I don't run a website making claims regarding cassandra and spark - your
employer does.

Again, where are your benchmarks?

I will publish mine, then we'll see what you've got.

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 11, 2015, at 8:39 AM, DuyHai Doan doanduy...@gmail.com wrote:
 
  For your information Colin: http://en.wikipedia.org/wiki/List_of_fallacies.
  Look at "Burden of proof".
  
  You stated: "The very nature of cassandra's distributed nature vs partitioning
  data on hadoop makes spark on hdfs actually faster than on cassandra"
  
  It's up to YOU to prove it right, not up to me to prove it wrong.
  
  All other bla bla is troll.
  
  Come back to me once you get some decent benchmarks supporting your
  statement; until then, the question is closed.
 
 
 
 On Wed, Feb 11, 2015 at 3:17 PM, Colin co...@clark.ws wrote:
 Did you want me to included specific examples from my employment at datastax 
 or start from the ground up? 
 
 All spark is on cassandra is a better than the previous use of hive. 
 
 The fact that datastax hasnt provided any benchmarks themselves other than 
 glossy marketing statements pretty much says it all-where are your 
 benchmarks?  Maybe you could combine it with the in memory option to really 
 boogie...
 
 :)
 
 (If I find time, I might just write a blog post about exactly how to do 
 this-it involves the use of parquet and partitioning with clustering-and it 
 doesnt cost anything to do it-and it's in production at my company)
 --
 Colin Clark 
 +1 612 859 6129
 Skype colin.p.clark
 
 On Feb 11, 2015, at 6:51 AM, DuyHai Doan doanduy...@gmail.com wrote:
 
 The very nature of cassandra's distributed nature vs partitioning data on 
 hadoop makes spark on hdfs actually fasted than on cassandra
 
 Prove it. Did you ever have a look into the source code of the 
 Spark/Cassandra connector to see how data locality is achieved before 
 throwing out such statement ?
 
 On Wed, Feb 11, 2015 at 12:42 PM, Marcelo Valle (BLOOMBERG/ LONDON) 
 mvallemil...@bloomberg.net wrote:
  cassandra makes a very poor datawarehouse ot long term time series store
 
 Really? This is not the impression I have... I think Cassandra is good to 
 store larges amounts of data and historical information, it's only not 
 good to store temporary data.
 Netflix has a large amount of data and it's all stored in Cassandra, 
 AFAIK. 
 
  The very nature of cassandra's distributed nature vs partitioning data 
  on hadoop makes spark on hdfs actually fasted than on cassandra.
 
 I am not sure about the current state of Spark support for Cassandra, but 
 I guess if you create a map reduce job, the intermediate map results will 
 be still stored in HDFS, as it happens to hadoop, is this right? I think 
 the problem with Spark + Cassandra or with Hadoop + Cassandra is that the 
 hard part spark or hadoop does, the shuffling, could be done out of the 
 box with Cassandra, but no one takes advantage on that. What if a map / 
 reduce job used a temporary CF in Cassandra to store intermediate results?
 
 From: user@cassandra.apache.org 
 Subject: Re: How to speed up SELECT * query in Cassandra
 I use spark with cassandra, and you dont need DSE.
 
 I see a lot of people ask this same question below (how do I get a lot of 
 data out of cassandra?), and my question is always, why arent you updating 
 both places at once?
 
 For example, we use hadoop and cassandra in conjunction with each other, 
 we use a message bus to store every event in both, aggregrate in both, but 
 only keep current data in cassandra (cassandra makes a very poor 
 datawarehouse ot long term time series store) and then use services to 
 process queries that merge data from hadoop and cassandra.  
 
 Also, spark on hdfs gives more flexibility in terms of large datasets and 
 performance.  The very nature of cassandra's distributed nature vs 
 partitioning data on hadoop makes spark on hdfs actually fasted than on 
 cassandra
 
 
 
 --
 Colin Clark 
 +1 612 859 6129
 Skype colin.p.clark
 
 On Feb 11, 2015, at 4:49 AM, Jens Rantil jens.ran...@tink.se wrote:
 
 
 On Wed, Feb 11, 2015 at 11:40 AM, Marcelo Valle (BLOOMBERG/ LONDON) 
 mvallemil...@bloomberg.net wrote:
 If you use Cassandra enterprise, you can use hive, AFAIK.
 
 Even better, you can use Spark/Shark with DSE.
 
 Cheers,
 Jens
 
 
 -- 
 Jens Rantil
 Backend engineer
 Tink AB
 
 Email: jens.ran...@tink.se
 Phone: +46 708 84 18 32
 Web: www.tink.se
 
 Facebook Linkedin Twitter
 


Re: How to speed up SELECT * query in Cassandra

2015-02-11 Thread Colin
I use spark with cassandra, and you don't need DSE.

I see a lot of people ask this same question below (how do I get a lot of data
out of cassandra?), and my question is always, why aren't you updating both
places at once?

For example, we use hadoop and cassandra in conjunction with each other, we use
a message bus to store every event in both, aggregate in both, but only keep
current data in cassandra (cassandra makes a very poor datawarehouse or long
term time series store) and then use services to process queries that merge
data from hadoop and cassandra.

Also, spark on hdfs gives more flexibility in terms of large datasets and
performance.  The very nature of cassandra's distributed nature vs partitioning
data on hadoop makes spark on hdfs actually faster than on cassandra



--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 11, 2015, at 4:49 AM, Jens Rantil jens.ran...@tink.se wrote:
 
 
 On Wed, Feb 11, 2015 at 11:40 AM, Marcelo Valle (BLOOMBERG/ LONDON) 
 mvallemil...@bloomberg.net wrote:
 If you use Cassandra enterprise, you can use hive, AFAIK.
 
 Even better, you can use Spark/Shark with DSE.
 
 Cheers,
 Jens
 
 
 -- 
 Jens Rantil
 Backend engineer
 Tink AB
 
 Email: jens.ran...@tink.se
 Phone: +46 708 84 18 32
 Web: www.tink.se
 
 Facebook Linkedin Twitter


Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Colin
Stop using opscenter?

:)

Sorry, couldn't resist...

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 9, 2015, at 3:01 AM, Björn Hachmann bjoern.hachm...@metrigo.de wrote:
 
 Good morning,
 
 unfortunately my last rolling restart of our Cassandra cluster issued from 
 OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is showing an 
 error message at the top of its screen:
 Error restarting cluster: Timed out waiting for Cassandra to start..
 
 Does anybody know how to remove that message permanently?
 
 Thank you very much in advance!
 
 Kind regards
 Björn Hachmann


Re: Mutable primary key in a table

2015-02-08 Thread Colin
Another way to do this is to use a time-based uuid for the primary key
(partition key) and to store the user name with that uuid.

In addition, you'll need 2 additional tables: one that is used to get the uuid
by user name, and another to track user name changes over time, which would be
organized by uuid and user name (clustered on the name).

This pattern is referred to as an inverted index and provides a lot of power 
and flexibility once mastered.  I use it all the time with cassandra - in fact, 
to be successful with cassandra, it might actually be a requirement!
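
A rough CQL sketch of those three tables (all names are made up):

    CREATE TABLE my_ks.users (
        user_id   timeuuid PRIMARY KEY,   -- surrogate key, never changes
        user_name text,
        name      text,
        city      text );

    -- lookup table: user_name -> user_id
    CREATE TABLE my_ks.users_by_name (
        user_name text PRIMARY KEY,
        user_id   timeuuid );

    -- history of user name changes, one partition per user, clustered by name
    CREATE TABLE my_ks.user_name_history (
        user_id    timeuuid,
        user_name  text,
        changed_at timestamp,
        PRIMARY KEY (user_id, user_name) );
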

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 8, 2015, at 8:08 AM, Jack Krupansky jack.krupan...@gmail.com wrote:
 
 What is your full primary key? Specifically, what is the partition key, as 
 opposed to clustering columns?
 
 The point is that the partition key for a row is hashed to determine the 
 token for the partition, which in turn determines which node of the cluster 
 owns that partition. Changing the partition key means that potentially the 
 partition would need to be moved to another node, which is clearly not 
 something that Cassandra would do since the core design of Cassandra is that 
 all operations should be blazingly fast and to refrain from offering slow 
 features.
 
 I would recommend that your application:
 
 1. Read the existing user data
 2. Create a new user, using the existing user data.
 3. Update the old user row to indicate that it is no longer a valid user. 
 Actually, you will have to decide on an application policy for old user 
 names. For example, can they be reused, or are they locked, or... whatever.
 
 
 -- Jack Krupansky
 
 On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal ajku@gmail.com wrote:
 
 On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens migh...@gmail.com wrote:
 I'm struggling to think of a model where it makes sense to update a primary 
 key as a typical operation.  It suggests, as Adil said, that you may be 
 reasoning wrong about your data model.  Maybe you can explain your problem 
 in more detail - what kind of thing has you updating your PK on a regular 
 basis?
 
 I have a 'user' table which has a column called 'user_name' and other 
 columns like name, city etc. The application requires that user_name be 
 unique and users should be searchable by 'user_name'. The only way to do this
 in C* would be to make the user_name column the primary key. Things get trickier
 when there is a requirement which says that user_name can be changed by the
 users of the application. This is a distributed application, which means that it
 runs on multiple nodes. If I have to change user_name atomically then either
 I need to implement distributed locking or use something C* provides.
 


Re: Mutable primary key in a table

2015-02-08 Thread Colin Clark
No need for CAS in my suggestion - I would try to avoid the use of CAS if at 
all possible.  

It’s better in a distributed environment to reduce dimensionality and isolate 
write/read paths (event sourcing and CQRS patterns).

Also, just in general, changing the primary key on an update is usually 
considered a bad idea and is simply not even permitted by most RDBMS.
—
Colin Clark
co...@clark.ws
+1 320-221-9531
skype colin.p.clark

 On Feb 8, 2015, at 4:16 PM, Eric Stevens migh...@gmail.com wrote:
 
 It sounds like changing user names is the kind of thing which doesn't happen 
 often, in which case you probably don't have to worry too much about the 
 additional overhead of using logged batches (not like you're going to be 
 doing hundreds to thousands of these per second).  You probably also want to 
 look into conditional updates (search for Compare And Set - CAS) to help 
 avoid collisions when creating or renaming users.
 
 Colin's suggestion of using a surrogate key for the primary key on the user 
 table is also a good idea, but you'll still want to use CAS to help maintain 
 the integrity of your data.  Note that CAS has a similar overhead to logged 
 batches in that it also involves a Paxos round.  So keep the number of 
 statements in either CAS or logged batches as minimal as possible.
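  
  As a rough sketch of a rename using tables like those in the earlier message
  (names and the uuid value are made up), the new name is claimed with a
  lightweight transaction first, and the rest only runs if [applied] comes back true:
  
      -- 1) claim the new user_name (single-partition CAS)
      INSERT INTO my_ks.users_by_name (user_name, user_id)
      VALUES ('new_name', 50554d6e-29bb-11e5-b345-feff819cdc9f)
      IF NOT EXISTS;
  
      -- 2) if applied, repoint the user row and drop the old lookup entry
      BEGIN BATCH
        UPDATE my_ks.users SET user_name = 'new_name'
         WHERE user_id = 50554d6e-29bb-11e5-b345-feff819cdc9f;
        DELETE FROM my_ks.users_by_name WHERE user_name = 'old_name';
      APPLY BATCH;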
 
  On Sun, Feb 8, 2015 at 7:17 AM, Colin co...@clark.ws wrote:
 Another way to do this is to use a time based uuid for the primary key 
 (partition key) and to store the user name with that uuid.
 
  In addition, you'll need 2 additional tables: one that is used to get the uuid
  by user name, and another to track user name changes over time, which would be
  organized by uuid and user name (clustered on the name).
 
 This pattern is referred to as an inverted index and provides a lot of power 
 and flexibility once mastered.  I use it all the time with cassandra - in 
 fact, to be successful with cassandra, it might actually be a requirement!
 
 --
 Colin Clark 
  +1 612 859 6129
 Skype colin.p.clark
 
  On Feb 8, 2015, at 8:08 AM, Jack Krupansky jack.krupan...@gmail.com wrote:
 
 What is your full primary key? Specifically, what is the partition key, as 
 opposed to clustering columns?
 
 The point is that the partition key for a row is hashed to determine the 
 token for the partition, which in turn determines which node of the cluster 
 owns that partition. Changing the partition key means that potentially the 
 partition would need to be moved to another node, which is clearly not 
 something that Cassandra would do since the core design of Cassandra is that 
 all operations should be blazingly fast and to refrain from offering slow 
 features.
 
 I would recommend that your application:
 
 1. Read the existing user data
 2. Create a new user, using the existing user data.
 3. Update the old user row to indicate that it is no longer a valid user. 
 Actually, you will have to decide on an application policy for old user 
 names. For example, can they be reused, or are they locked, or... whatever.
 
 
 -- Jack Krupansky
 
  On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal ajku@gmail.com wrote:
  
  On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens migh...@gmail.com wrote:
 I'm struggling to think of a model where it makes sense to update a primary 
 key as a typical operation.  It suggests, as Adil said, that you may be 
 reasoning wrong about your data model.  Maybe you can explain your problem 
 in more detail - what kind of thing has you updating your PK on a regular 
 basis?
 
 I have a 'user' table which has a column called 'user_name' and other 
 columns like name, city etc. The application requires that user_name be 
 unique and user should be searchable by 'user_name'. The only way to do this 
 in C* would be to make user_name column primary key. Things get trickier 
 when there is a requirement which says that user_name can be changed by the 
 users of the application. This a distributed application which mean that it 
 runs on multiple nodes. If I have to change user_name atomically then either 
 I need to implement distributed locking or use something C* provides.   
 
 
 





Re: High GC activity on node with 4TB on data

2015-02-08 Thread Colin
The most data I put on a node with spinning disk is 1TB.

What are the machine specs? CPU, memory, etc. What is the read/write
pattern (heavy ingest rate / heavy read rate), and how long do you keep data in the
cluster?

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 8, 2015, at 2:44 PM, Jiri Horky ho...@avast.com wrote:
 
 Hi all,
 
 we are seeing quite high GC pressure (in the old space, with the CMS GC algorithm)
 on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory
 (2G for new space). The node runs fine for a couple of days until the GC
 activity starts to rise and reaches about 15% of the C* activity, which
 causes dropped messages and other problems.
 
 Taking a look at heap dump, there is about 8G used by SSTableReader
 classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.
 
 Is this something expected and have we just reached the limit of how
 much data a single Cassandra instance can handle, or is it possible to
 tune it better?
 
 Regards
 Jiri Horky


Re: Cassandra on Ceph

2015-02-01 Thread Colin Taylor
Oops -  Nonetheless in on my environments  -  Nonetheless in *one of* my
environments

On 2 February 2015 at 16:12, Colin Taylor colin.tay...@gmail.com wrote:

 Thanks all for your input.

 I'm aware of the overlap, I'm aware I need to turn Ceph replication off,
 I'm aware this isn't ideal. Nonetheless in on my environments instead of
 raw disk to install C* on, I'm likely to just have Ceph storage. This is a
 fully managed environment (excepting for C*) and that's their standard.

 cheers
 Colin

 On 2 February 2015 at 14:42, Daniel Compton 
 daniel.compton.li...@gmail.com wrote:

 As Jan has already mentioned, Ceph and Cassandra do almost all of the
 same things. Replicated self healing data storage on commodity hardware
 without a SPOF describes both of these systems. If you did manage to get
 it running it would be a nightmare to reason about what's happening at the
 disk and network level.

 You're going to get write amplification by your replication factor of
 both Cassandra, and Ceph unless you turn one of them down. This impacts
 disk I/O, disk space, CPU, and network bandwidth. If you turned down Ceph
 replication I think it would be possible for all of the replicated data for
 some chunk to be stored on one node and be at risk of loss. E.g. 1x Ceph,
 3x Cassandra replication could store all 3 Cassandra replicas on the same
 Ceph node. 3x Ceph, 1x Cassandra would be safer, but presumably slower.

 Lastly Cassandra is designed around running against local disks, you will
 lose a lot of the advantages of this running it on Ceph.

 Daniel.

 On Mon, 2 Feb 2015 at 1:11 am Baskar Duraikannu 
 baskar.duraika...@outlook.com wrote:

  What is the reason for running Cassandra on Ceph? I have both running
 in my environment but doing different things - Cassandra as transactional
 store and Ceph as block storage for storing files.
  --
 From: Jan cne...@yahoo.com
 Sent: ‎2/‎1/‎2015 2:53 AM
 To: user@cassandra.apache.org
 Subject: Re: Cassandra on Ceph

   Colin;

  Ceph is a block based storage architecture based on RADOS.
It comes with its own replication and rebalancing along with a map of the
 storage layer.

  Some more details and similarities:
  a)Ceph stores a client’s data as objects within storage pools.   (think
 of C* partitions)
  b) Using the CRUSH algorithm, Ceph calculates which placement group
should contain the object (C* primary keys and vnode data distribution)
  c) and further calculates which Ceph OSD Daemon should store the
 placement group   (C* node locality)
  d) The CRUSH algorithm enables the Ceph Storage Cluster to scale,
 rebalance, and recover dynamically (C* big table storage architecture).

 Summary:
 C*  comes with everything that Ceph provides (with the exception of
 block storage).
  There is no value add that Ceph brings to the table that C* does not
 already provide.
  I seriously doubt if C* could even work out of the box with yet another
level of replication and rebalancing.

  Hope this helps
  Jan/

  C* Architect






   On Saturday, January 31, 2015 7:28 PM, Colin Taylor 
 colin.tay...@gmail.com wrote:


  I may be forced to run Cassandra on top of Ceph. Does anyone have
 experience / tips with this. Or alternatively, strong reasons why this
 won't work.

  cheers
 Colin






Re: Cassandra on Ceph

2015-02-01 Thread Colin Taylor
Thanks all for your input.

I'm aware of the overlap, I'm aware I need to turn Ceph replication off,
I'm aware this isn't ideal. Nonetheless in on my environments instead of
raw disk to install C* on, I'm likely to just have Ceph storage. This is a
fully managed environment (excepting for C*) and that's their standard.

cheers
Colin

On 2 February 2015 at 14:42, Daniel Compton daniel.compton.li...@gmail.com
wrote:

 As Jan has already mentioned, Ceph and Cassandra do almost all of the same
 things. Replicated self healing data storage on commodity hardware without
 a SPOF describes both of these systems. If you did manage to get it
 running it would be a nightmare to reason about what's happening at the
 disk and network level.

 You're going to get write amplification by your replication factor of both
 Cassandra, and Ceph unless you turn one of them down. This impacts disk
 I/O, disk space, CPU, and network bandwidth. If you turned down Ceph
 replication I think it would be possible for all of the replicated data for
 some chunk to be stored on one node and be at risk of loss. E.g. 1x Ceph,
 3x Cassandra replication could store all 3 Cassandra replicas on the same
 Ceph node. 3x Ceph, 1x Cassandra would be safer, but presumably slower.

 Lastly Cassandra is designed around running against local disks, you will
 lose a lot of the advantages of this running it on Ceph.

 Daniel.

 On Mon, 2 Feb 2015 at 1:11 am Baskar Duraikannu 
 baskar.duraika...@outlook.com wrote:

  What is the reason for running Cassandra on Ceph? I have both running
 in my environment but doing different things - Cassandra as transactional
 store and Ceph as block storage for storing files.
  --
 From: Jan cne...@yahoo.com
 Sent: ‎2/‎1/‎2015 2:53 AM
 To: user@cassandra.apache.org
 Subject: Re: Cassandra on Ceph

   Colin;

  Ceph is a block based storage architecture based on RADOS.
It comes with its own replication and rebalancing along with a map of the
 storage layer.

  Some more details and similarities:
  a)Ceph stores a client’s data as objects within storage pools.   (think
 of C* partitions)
  b) Using the CRUSH algorithm, Ceph calculates which placement group
should contain the object (C* primary keys and vnode data distribution)
  c) and further calculates which Ceph OSD Daemon should store the
 placement group   (C* node locality)
  d) The CRUSH algorithm enables the Ceph Storage Cluster to scale,
 rebalance, and recover dynamically (C* big table storage architecture).

 Summary:
 C*  comes with everything that Ceph provides (with the exception of block
 storage).
  There is no value add that Ceph brings to the table that C* does not
 already provide.
  I seriously doubt if C* could even work out of the box with yet another
 level of replication and rebalancing.

  Hope this helps
  Jan/

  C* Architect






   On Saturday, January 31, 2015 7:28 PM, Colin Taylor 
 colin.tay...@gmail.com wrote:


  I may be forced to run Cassandra on top of Ceph. Does anyone have
 experience / tips with this. Or alternatively, strong reasons why this
 won't work.

  cheers
 Colin





Re: Question about use scenario with fulltext search

2015-02-01 Thread Colin
I use solr and cassandra but not together.  I write what I want indexed into 
solr (and only unstructured data), and related data into either cassandra or 
oracle.  I use the same key across all three db's.

When I need full text search etc, I read the data from solr, grab the keys, and 
go get the data from the other db's.

This avoids conflation of concerns, isolates failures, but is dependent upon 
multiple writes.  I use a message bus and services based approach.

In my experience, at scale this approach works better and avoids vendor lock in.

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 2, 2015, at 7:25 AM, Asit KAUSHIK asitkaushikno...@gmail.com wrote:
 
  I tried elasticsearch but pulling the data from Cassandra is a big pain.
  The river pulls up all the data every time, with no incremental approach.
  It's a great product but I had to change my writing approach, which I am just
  doing in Cassandra from the .net client.
  Also you have to create a separate infrastructure for elasticsearch.
  Again, this is what I found with limited analysis of elasticsearch.
 
 Regards
 Asit
 
 
 On Mon, Feb 2, 2015 at 11:43 AM, Asit KAUSHIK asitkaushikno...@gmail.com 
 wrote:
  Also there is a project called Stargate-Core which gives you the ability to query
  with wildcard characters.
  The location is
  https://github.com/tuplejump/stargate-core/releases/tag/0.9.9
  
  It supports the 2.0.11 version of cassandra.
 
 
 
  Also elasticsearch is another product, but pumping the data from Cassandra into
  elasticsearch is a bad option. You have to design your writes such that you
  write to both.
  But I am using Stargate-Core personally and it's very easy to implement and
  use.
  
  Hope this adds a cent to your evaluation of this topic.
 
 Regards
 Asit
 
 
 
 
 
 On Sun, Feb 1, 2015 at 10:45 PM, Mark Reddy mark.l.re...@gmail.com wrote:
 If you have a list of usernames stored in your cassandra database,
 how could you find all usernames starting with 'Jo'?
 
  Cassandra does not support full text search on its own; if you are looking
  into DataStax Enterprise Cassandra there is an integration with Solr that
  gives you this functionality.
 
 Personally for projects I work on that use Cassandra and require full text 
 search, the necessary data is indexed into Elasticsearch.
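  
  For the narrow prefix case in the question (names starting with 'Jo'), as opposed
  to real full text search, one common native workaround is a lookup table keyed by
  a short lowercased prefix, with the full name as a clustering column; a rough
  sketch with made-up names:
  
      CREATE TABLE my_ks.users_by_prefix (
          prefix  text,     -- e.g. the first two letters, lowercased: 'jo'
          name    text,
          user_id uuid,
          PRIMARY KEY (prefix, name) );
  
      -- range scan on the clustering column within a single partition
      SELECT name, user_id FROM my_ks.users_by_prefix
       WHERE prefix = 'jo' AND name >= 'Jo' AND name < 'Jp';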
 
 Or ... if this is not possible,
 what are you using cassandra for?
 
 If you are looking for use cases here is a comprehensive set from companies 
 spanning many industries: 
 http://planetcassandra.org/apache-cassandra-use-cases/
 
 
 Regards,
 Mark
 
 On 1 February 2015 at 16:05, anton anto...@gmx.de wrote:
 Hi,
 
 I was just reading about cassandra and playing a little
 with it (using django www.djangoproject.com on the web server).
 
 One thing that I realized now is that fulltext search
 as in a normal sql statement (example):
 
   select name from users where name like 'Jo%';
 
  Simply does not work because this functionality does not exist.
  After reading and googling and reading ...
  I still do not understand how I could use a db without this
  functionality (if I do not want to restrict myself to numerical data).
 
 So my question is:
 
 If you have a list of usernames stored in your cassandra database,
 how could you find all usernames starting with 'Jo'?
 
 
 Or ... if this is not possible,
 what are you using cassandra for?
 
 Actually I still did not get the point of how I could use cassandra :-(
 
 Anton
 


Cassandra on Ceph

2015-01-31 Thread Colin Taylor
I may be forced to run Cassandra on top of Ceph. Does anyone have
experience / tips with this. Or alternatively, strong reasons why this
won't work.

cheers
Colin


Re: number of replicas per data center?

2015-01-18 Thread Colin
I like to have 3 replicas across 3 racks in each datacenter as a rule of thumb.
You can vary that, but it depends upon the use case and the SLAs for latency.

This can get a little complicated if you're using the cloud and automated 
deployment strategies as I like to use the same abstractions externally as 
internally.

--
Colin Clark 
+1-320-221-9531
 

 On Jan 18, 2015, at 9:49 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 How do people normally setup multiple data center replication in terms of 
 number of *local* replicas?
 
 So say you have two data centers, do you have 2 local replicas, for a total 
 of 4 replicas?  Or do you have 2 in one datacenter, and 1 in another?
 
 If you only have one in a local datacenter then when it fails you have to 
 transfer all that data over the WAN.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 


Re: who owns the data?

2015-01-17 Thread Colin
Try running

Nodetool status system

By specifying a keyspace (system) in the command, you should get more
meaningful results.  Using the command on your own keyspaces as you dev/test
will provide real results.

--
Colin Clark 
+1-320-221-9531
 

 On Jan 17, 2015, at 7:22 PM, Tim Dunphy bluethu...@gmail.com wrote:
 
 
 Hey all,
 
  I've set up a 3-node cassandra ring. All the nodes are reporting in and
 appear to be working correctly. I have an RF setting of 3.
 
 However, under the 'Owns' column, all that I see is a '?'.
 
 
 [root@beta-new:~] #nodetool status
 Datacenter: datacenter1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address  Load   Tokens  OwnsHost ID   
 Rack
 UN  xx.xx.xx.xx 227.99 KB  256 ?   
 97c6e976-8cad-4f39-af21-00360f091b37  rack1
 UN  xx.xx.xx.xx  269.4 KB   256 ?   
 7014cf67-4632-41ee-b4f3-454874c3b402  rack1
 UN  xx.xx.xx.xx   422.91 KB  256 ?   
 14f0efab-3ca5-43f3-9576-b9054c5a2557  rack1
 
 I was just wondering if this might mean I have a configuration error in my 
 cassandra.yaml file. How can I see the % of data owned by each node?
 
 Thanks
 Tim
 -- 
 GPG me!!
 
 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
 


Re: Best approach in Cassandra (+ Spark?) for Continuous Queries?

2015-01-03 Thread Colin
Use a message bus with a transactional get: get the message, send it to cassandra,
and upon write success, submit it to the ESP and commit the get on the bus.
Messaging systems like rabbitmq support these semantics.

Using cassandra as a queuing mechanism is an anti-pattern.

--
Colin Clark 
+1-320-221-9531
 

 On Jan 3, 2015, at 6:07 PM, Hugo José Pinto hugo.pi...@inovaworks.com wrote:
 
 Thank you all for your answers.
 
 It seems I'll have to go with some event-driven processing before/during the 
 Cassandra write path. 
 
 My concern would be that I'd love to first guarantee the disk write of the 
 Cassandra persistence and then do the event processing (which is mostly CRUD 
 intercepts at this point), even if slightly delayed, and doing so via 
 triggers would probably bog down the whole processing pipeline. 
 
 What I'd probably do is to write, in trigger, a separate key table with all 
 the CRUDed elements and to have the ESP process that table.
 
  Thank you for your contribution. Should anyone else have any experience
  in these scenarios, I'm obviously all ears as well.
 
 Best,
 
 Hugo 
 
 Enviado do meu iPhone
 
 No dia 03/01/2015, às 11:09, DuyHai Doan doanduy...@gmail.com escreveu:
 
 Hello Hugo
 
  I was facing the same kind of requirement from some users. Long story 
 short, below are the possible strategies with advantages and draw-backs of 
 each
 
 1) Put Spark in front of the back-end, every incoming 
 modification/update/insert goes into Spark first, then Spark will forward it 
 to Cassandra for persistence. With Spark, you can perform pre or 
 post-processing and notify external clients of mutation.
 
   The drawback of this solution is that all the incoming mutations must go
  through Spark. You may set up a Kafka queue as temporary storage to
  distribute the load and consume mutations with Spark, but it adds to the
  architecture complexity with additional components and technologies.
 
 2) For high availability and resilience, you probably want to have all 
 mutations saved first into Cassandra then process notifications with Spark. 
 In this case the only way to have notifications from Cassandra, as of 
 version 2.1, is to rely on manually coded triggers (which is still 
 experimental feature).
 
 With the triggers you can notify whatever clients you want, not only Spark.
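  
  For reference, the trigger class itself is Java (implementing ITrigger), but
  attaching it to a table is plain CQL; a sketch with made-up names:
  
      CREATE TRIGGER notify_changes ON my_ks.my_table
      USING 'com.example.ChangeNotificationTrigger';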
 
  The big drawback of this solution is that playing with triggers is 
 dangerous if you are not familiar with Cassandra internals. Indeed the 
 trigger is on the write path and may hurt performance if you are doing 
 complex and blocking tasks.
 
 That's the 2 solutions I can see, maybe the ML members will propose other 
 innovative choices
 
  Regards
 
 On Sat, Jan 3, 2015 at 11:46 AM, Hugo José Pinto 
 hugo.pi...@inovaworks.com wrote:
 Hello.
 
 We're currently using Hazelcast (http://hazelcast.org/) as a distributed 
 in-memory data grid. That's been working sort-of-well for us, but going 
 solely in-memory has exhausted its path in our use case, and we're 
 considering porting our application to a NoSQL persistent store. After the 
 usual comparisons and evaluations, we're borderline close to picking 
 Cassandra, plus eventually Spark for analytics.
 
 Nonetheless, there is a gap in our architectural needs that we're still not 
 grasping how to solve in Cassandra (with or without Spark): Hazelcast 
 allows us to create a Continuous Query in that, whenever a row is 
 added/removed/modified from the clause's resultset, Hazelcast calls up back 
 with the corresponding notification. We use this to continuously update the 
 clients via AJAX streaming with the new/changed rows.
 
 This is probably a conceptual mismatch we're making, so - how to best 
 address this use case in Cassandra (with or without Spark's help)? Is there 
 something in the API that allows for Continuous Queries on key/clause 
 changes (haven't found it)? Is there some other way to get a stream of 
 key/clause updates? Events of some sort?
 
 I'm aware that we could, eventually, periodically poll Cassandra, but in 
 our use case, the client is potentially interested in a large number of 
 table clause notifications (think all changes to Ship positions on 
 California's coastline), and iterating out of the store would kill the 
 streamer's scalability.
 
 Hence, the magic question: what are we missing? Is Cassandra the wrong tool 
 for the job? Are we not aware of a particular part of the API or external 
 library in/outside the apache realm that would allow for this?
 
 Many thanks for any assistance!
 
 Hugo
 
 


Re: STCS limitation with JBOD?

2015-01-02 Thread Colin
Forcing a major compaction is usually a bad idea.  What is your reason for 
doing that?

--
Colin Clark 
+1-320-221-9531
 

 On Jan 2, 2015, at 1:17 PM, Dan Kinder dkin...@turnitin.com wrote:
 
 Hi,
 
 Forcing a major compaction (using nodetool compact) with STCS will result in 
 a single sstable (ignoring repair data). However this seems like it could be 
 a problem for large JBOD setups. For example if I have 12 disks, 1T each, 
 then it seems like on this node I cannot have one column family store more 
 than 1T worth of data (more or less), because all the data will end up in a 
 single sstable that can exist only on one disk. Is this accurate? The 
 compaction write path docs give a bit of hope that cassandra could split the 
 one final sstable across the disks, but I doubt it is able to and want to 
 confirm.
 
  I imagine that RAID/LVM, using LCS, or multiple cassandra instances not in
  JBOD mode could be solutions to this (with their own problems), but I want to
  verify that this actually is a problem.
 
 -dan


Re: Cassandra for Analytics?

2014-12-18 Thread Colin
Almost every stream processing system I know of offers joins out of the box and
has done so for years.

Even open source offerings like Esper have offered joins for years.

What haven't are systems like storm, spark, etc., which I don't really classify as
stream processors anyway.



--
Colin Clark 
+1-320-221-9531
 

 On Dec 18, 2014, at 1:52 PM, Peter Lin wool...@gmail.com wrote:
 
 that depends on what you mean by real-time analytics.
 
 For things like continuous data streams, neither are appropriate platforms 
 for doing analytics. They're good for storing the results (aka output) of the 
 streaming analytics. I would suggest before you decide cassandra vs hbase, 
 first figure out exactly what kind of analytics you need to do. Start with 
 prototyping and look at what kind of queries and patterns you need to support.
 
  neither hbase nor cassandra is good for complex patterns that do joins or
  cross joins (aka mdx), so using either one you have to re-invent stuff.
 
 most of the event processing and stream processing products out there also 
 don't support joins or cross joins very well, so any solution is going to 
 need several different components. typically stream processing does 
 filtering, which feeds another system that does simple joins. The output of 
 the second step can then go to another system that does mdx style queries.
 
 spark streaming has basic support, but it's not as mature and feature rich as 
 other stream processing products.
 
 On Wed, Dec 17, 2014 at 11:20 PM, Ajay ajay.ga...@gmail.com wrote:
 Hi,
 
 Can Cassandra be used or best fit for Real Time Analytics? I went through 
 couple of benchmark between Cassandra Vs HBase (most of it was done 3 years 
 ago) and it mentioned that Cassandra is designed for intensive writes and 
 Cassandra has higher latency for reads than HBase. In our case, we will have 
 writes and reads (but reads will be more say 40% writes and 60% reads). We 
 are planning to use Spark as the in memory computation engine.
 
 Thanks
 Ajay


Re: sstables keep growing on cassandra 2.1

2014-11-19 Thread Colin Kuo
Hi,

Can you please first check nodetool compactionstats during the repair?
I'm afraid that minor compaction may be blocked by whatever task is
causing the number of SSTables to keep growing.

On Sat, Nov 15, 2014 at 7:47 AM, James Derieg james.der...@uplynk.com
wrote:

 Hi everyone,
 I'm hoping someone can help me with a weird issue on Cassandra 2.1.
 The sstables on my cluster keep growing to a huge number when I run a
 nodetool repair.  On the attached graph, I ran a manual 'nodetool compact'
 on each node in the cluster, which brought them back down to a low number
 of sstables.  Then I immediately ran a nodetool repair, and the sstables
 jumped back up.  Has anyone seen this behavior?  Is this expected? I have
 some 2.0 clusters in the same environment, and they don't do this.
 Thanks in advance for your help.



Re: Repair/Compaction Completion Confirmation

2014-10-28 Thread Colin
When I use virtual nodes, I typically use a much smaller number - usually in 
the range of 10.  This gives me the ability to add nodes easier without the 
performance hit.



--
Colin Clark 
+1-320-221-9531
 

 On Oct 28, 2014, at 10:46 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:
 
 I have been trying this yesterday too.
 
 https://github.com/BrianGallew/cassandra_range_repair
 
 Not 100% bullet proof -- Indeed I found that operations are done multiple 
 times, so it is not very optimised. Though it is open sourced so I guess you 
 can improve things as much as you want and contribute. Here is the issue I 
 raised yesterday 
 https://github.com/BrianGallew/cassandra_range_repair/issues/14.
 
 I am also trying to improve our repair automation since we now have multiple 
 DC and up to 800 GB per node. Repairs are quite heavy right now.
 
 Good luck,
 
 Alain
 
 2014-10-28 4:59 GMT+01:00 Ben Bromhead b...@instaclustr.com:
 https://github.com/BrianGallew/cassandra_range_repair
 
 This breaks down the repair operation into very small portions of the ring 
 as a way to try and work around the current fragile nature of repair. 
 
 Leveraging range repair should go some way towards automating repair (this 
 is how the automatic repair service in DataStax opscenter works, this is how 
 we perform repairs).
 
 We have had a lot of success running repairs in a similar manner against 
 vnode enabled clusters. Not 100% bullet proof, but way better than nodetool 
 repair 
 
 
 
 On 28 October 2014 08:32, Tim Heckman t...@pagerduty.com wrote:
 On Mon, Oct 27, 2014 at 1:44 PM, Robert Coli rc...@eventbrite.com wrote:
 On Mon, Oct 27, 2014 at 1:33 PM, Tim Heckman t...@pagerduty.com wrote:
 I know that when issuing some operations via nodetool, the command blocks 
 until the operation is finished. However, is there a way to reliably 
 determine whether or not the operation has finished without monitoring 
 that invocation of nodetool?
 
 In other words, when I run 'nodetool repair' what is the best way to 
 reliably determine that the repair is finished without running something 
 equivalent to a 'pgrep' against the command I invoked? I am curious about 
 trying to do the same for major compactions too.
 
 This is beyond a FAQ at this point, unfortunately; non-incremental repair 
 is awkward to deal with and probably impossible to automate. 
 
 In The Future [1] the correct solution will be to use incremental repair, 
 which mitigates but does not solve this challenge entirely.
 
 As brief meta commentary, it would have been nice if the project had spent 
 more time optimizing the operability of the critically important thing you 
 must do once a week [2].
 
 https://issues.apache.org/jira/browse/CASSANDRA-5483
 
 =Rob
 [1] http://www.datastax.com/dev/blog/anticompaction-in-cassandra-2-1
 [2] Or, more sensibly, once a month with gc_grace_seconds set to 34 days.
 
 Thank you for getting back to me so quickly. Not the answer that I was 
 secretly hoping for, but it is nice to have confirmation. :)
 
 Cheers!
 -Tim 
 
 
 
 -- 
 Ben Bromhead
 
 Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359
 


Re: opscenter with community cassandra

2014-10-28 Thread Colin
I can't run opscenter in a secure environment for a couple of reasons: one, it
phones home; two, lack of role-based security.

It is a mistake to call a proprietary piece of software "community" when you can't
use it in production.

It is easy enough to automate what opscenter does rather than relying on a
third party in my enterprise.



 On Oct 28, 2014, at 10:04 AM, Josh Smith josh.sm...@careerbuilder.com wrote:
 
 Yes Opscenter does work with the opensource version of Cassandra. I am 
 currently running both in the cloud and our private datacenter with no 
 problems. I have not tried 2.1.1 yet but I do not see why it wouldn’t work 
 also.
  
 Josh
  
 From: Tim Dunphy [mailto:bluethu...@gmail.com] 
 Sent: Tuesday, October 28, 2014 10:43 AM
 To: user@cassandra.apache.org
 Subject: opscenter with community cassandra
  
 Hey all,
  
  I'd like to setup datastax opscenter to monitor my cassandra ring. However 
 I'm using the open source version of 2.1.1. And before I expend any time and 
 effort in setting this up, I'm wondering if it will work with the open source 
 version? Or would I need to be running datastax cassandra in order to get 
 this going?
  
 Thanks
 Tim
  
 -- 
 GPG me!!
 
 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Re: opscenter with community cassandra

2014-10-28 Thread Colin
No, actually, you can't, Tyler.

If you mean the useless information it provides outside of a licence, fine; if 
you mean the components outside of it, then the same argument applies.

Last time I checked, this forum was about Apache and not about DataStax.  
Maybe a separate group should be dedicated to provider-specific offerings.

--
Colin Clark 
+1-320-221-9531
 

 On Oct 28, 2014, at 10:41 AM, Tyler Hobbs ty...@datastax.com wrote:
 
 
 On Tue, Oct 28, 2014 at 10:08 AM, Colin colpcl...@gmail.com wrote:
 It is a mistake to call a proprietary piece of software "community" when you 
 can't use it in production.
 
 You can use OpsCenter community in production (however you'd like).
 
 
 -- 
 Tyler Hobbs
 DataStax


Re: opscenter with community cassandra

2014-10-28 Thread Colin
Ken, go ahead and check the difference in functionality between licensed and 
not, and what it takes to run a cluster, and then get back to me.

Not only did I use to work for DataStax, but before that I was the lead 
architect for a very large project using Cassandra.  Please don't confuse 
Apache Cassandra with DataStax's proprietary offerings.

And now I will tell you, as a blanket statement, that OpsCenter *IS NOT READY 
FOR A PRODUCTION ENVIRONMENT*.

Your mileage may vary, and you might not take security very seriously, so go 
ahead and expose your cluster.

Enjoy!
--
Colin Clark 
+1-320-221-9531
 

 On Oct 28, 2014, at 10:52 AM, Ken Hancock ken.hanc...@schange.com wrote:
 
 Your criteria for what is appropriate for production may differ from others, 
 but it's equally incorrect of you to make a blanket statement that OpsCenter 
 isn't suitable for production.  A number of people use it in production.
 
 
 
 On Tue, Oct 28, 2014 at 11:48 AM, Colin co...@clark.ws wrote:
 No, actually, you can't, Tyler.
 
 If you mean the useless information it provides outside of a licence, fine;  
 if you mean the components outside of it, then the same argument applies.
 
 Last time I checked, this forum was about Apache and not about DataStax.  
 Maybe a separate group should be dedicated to provider-specific offerings.
 
 --
 Colin Clark 
 +1-320-221-9531
  
 
 On Oct 28, 2014, at 10:41 AM, Tyler Hobbs ty...@datastax.com wrote:
 
 
 On Tue, Oct 28, 2014 at 10:08 AM, Colin colpcl...@gmail.com wrote:
 It is a mistake to call a proprietary piece of software "community" when you 
 can't use it in production.
 
 You can use OpsCenter community in production (however you'd like).
 
 
 -- 
 Tyler Hobbs
 DataStax
 
 
 
 -- 
 Ken Hancock | System Architect, Advanced Advertising 
 SeaChange International 
 50 Nagog Park
 Acton, Massachusetts 01720
 ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC 
 Office: +1 (978) 889-3329 |  ken.hanc...@schange.com | hancockks | hancockks  
 
 
 This e-mail and any attachments may contain information which is SeaChange 
 International confidential. The information enclosed is intended only for the 
 addressees herein and may not be copied or forwarded without permission from 
 SeaChange International.


Re: decommissioning a cassandra node

2014-10-27 Thread Colin Kuo
Hi Tim,

The node with IP 94 is leaving. Maybe something went wrong while streaming
data. You could use nodetool netstats on both nodes to check whether any
streaming connection is stuck.

Alternatively, you could force-remove the leaving node by shutting it down
directly and then running nodetool removenode to remove the dead node. But you
should understand that you are taking the risk of losing data if your RF in the
cluster is lower than 3 and the data has not been fully synced. Therefore,
remember to sync data with a repair before you remove/decommission a node
in the cluster.
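
Roughly, and with the host ID and keyspace name as placeholders:

nodetool netstats                     # on both nodes, check for stuck streams
nodetool repair my_keyspace           # make sure replicas are in sync first
nodetool removenode <host-id>         # only once the old node is down for good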

Thanks!

On Mon, Oct 27, 2014 at 9:55 PM, Tim Dunphy bluethu...@gmail.com wrote:

 Also, is there any document that explains what all the nodetool
 abbreviations (UN, UL) stand for?
 -- The documentation is in the command output itself
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address         Load     Tokens  Owns   Host ID                               Rack
 UN  162.243.86.41   1.08 MB  1       0.1%   e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
 UL  162.243.109.94  1.28 MB  256     99.9%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
 U = Up, D = Down
 N = Normal, L = Leaving, J = Joining and M = Moving


 Ok, got it, thanks!

 Can someone suggest a good way to fix a node that is in an UL state?

 Thanks
 Tim

 On Mon, Oct 27, 2014 at 9:46 AM, DuyHai Doan doanduy...@gmail.com wrote:

 Also, is there any document that explains what all the nodetool
 abbreviations (UN, UL) stand for?

 -- The documentation is in the command output itself

 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address         Load     Tokens  Owns   Host ID                               Rack
 UN  162.243.86.41   1.08 MB  1       0.1%   e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
 UL  162.243.109.94  1.28 MB  256     99.9%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1

 U = Up, D = Down
 N = Normal, L = Leaving, J = Joining and M = Moving

 On Mon, Oct 27, 2014 at 2:42 PM, Tim Dunphy bluethu...@gmail.com wrote:

 As I see the state 162.243.109.94 is UL(Up/Leaving) so maybe this is
 causing the problem


 OK, that's an interesting observation. How do you fix a node that is in a
 UL state? What causes this?

 Also, is there any document that explains what all the nodetool
 abbreviations (UN, UL) stand for?

 On Mon, Oct 27, 2014 at 5:46 AM, jivko donev jivko_...@yahoo.com
 wrote:

 As I see the state 162.243.109.94 is UL(Up/Leaving) so maybe this is
 causing the problem.


   On Sunday, October 26, 2014 11:57 PM, Tim Dunphy 
 bluethu...@gmail.com wrote:


 Hey all,

  I'm trying to decommission a node.

  First I'm getting a status:

 [root@beta-new:/usr/local] #nodetool status
 Note: Ownership information does not include topology; for complete
 information, specify a keyspace
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address         Load     Tokens  Owns   Host ID                               Rack
 UN  162.243.86.41   1.08 MB  1       0.1%   e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
 UL  162.243.109.94  1.28 MB  256     99.9%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1


 But when I try to decommission the node I get this message:

 [root@beta-new:/usr/local] #nodetool -h 162.243.86.41 decommission
 nodetool: Failed to connect to '162.243.86.41:7199' -
 NoSuchObjectException: 'no such object in table'.

 Yet I can telnet to that host on that port just fine:

 [root@beta-new:/usr/local] #telnet 162.243.86.41 7199
 Trying 162.243.86.41...
 Connected to 162.243.86.41.
 Escape character is '^]'.


 And I have verified that cassandra is running and accessible via cqlsh
 on the other machine.

 What could be going wrong?

 Thanks
 Tim


 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B






 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B





 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B




Re: which snitch ?

2014-10-26 Thread Colin
I would try PropertyFileSnitch and use the public IPs of the nodes in AWS.  
You'll need to set the configuration files on each node.
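
A minimal sketch of those files (the data center/rack names and IPs below are made up):

# cassandra.yaml
endpoint_snitch: PropertyFileSnitch

# cassandra-topology.properties (identical copy on every node)
10.0.0.1=DC_PRIVATE:RAC1
10.0.0.2=DC_PRIVATE:RAC1
54.1.2.3=DC_EC2:RAC1
54.1.2.4=DC_EC2:RAC1
default=DC_PRIVATE:RAC1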




 On Oct 26, 2014, at 9:44 PM, Srinivas Chamarthi 
 srinivas.chamar...@gmail.com wrote:
 
 What about the nodes in the private cloud cluster? If I specify 
 ec2MultiRegion, it fails since it tries to invoke the AWS API on the 
 node inside the snitch. Should I specify GossipingPropertyFileSnitch? I am 
 not sure if I can mix and match. Can someone advise me? 
 
 thx
 srinivas
 
 On Sat, Oct 25, 2014 at 9:13 PM, Arun arunsi...@gmail.com wrote:
 Srinivas,
 
 Use ec2multiregion snitch.
 
 
  On Oct 25, 2014, at 19:47, Srinivas Chamarthi 
  srinivas.chamar...@gmail.com wrote:
 
  I am using a datacenter in private network and want to replicate data to 
  an ec2 data center. I am confused which snitch to use so that my data in 
  the dc1 is replicated to dc2 in ec2 ?  any help is greatly appreciated
 
  thx
  srinivas
 


Re: Dynamic schema modification an anti-pattern?

2014-10-07 Thread Colin
Anti-pattern.  Dynamically altering the schema won't scale and is bad juju.
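
The wide-row alternative is roughly the following in CQL (a sketch only; the names and 
the choice of value/type encoding are up to the team):

CREATE TABLE entity (
    entity_id  uuid,
    attr_name  text,
    attr_value text,      -- or blob, with type metadata in a companion column
    PRIMARY KEY (entity_id, attr_name)
);

One partition per entity, one clustering row per attribute; adding a new attribute name 
never touches the schema.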

--
Colin Clark 
+1-320-221-9531
 

 On Oct 6, 2014, at 10:56 PM, Todd Fast t...@toddfast.com wrote:
 
 There is a team at my work building a entity-attribute-value (EAV) store 
 using Cassandra. There is a column family, called Entity, where the partition 
 key is the UUID of the entity, and the columns are the attributes names with 
 their values. Each entity will contain hundreds to thousands of attributes, 
 out of a list of up to potentially ten thousand known attribute names.
 
 However, instead of using wide rows with dynamic columns (and serializing 
 type info with the value), they are trying to use a static column family and 
 modifying the schema dynamically as new named attributes are created.
 
 (I believe one of the main drivers of this approach is to use collection 
 columns for certain attributes, and perhaps to preserve type metadata for a 
 given attribute.)
 
 This approach goes against everything I've seen and done in Cassandra, and is 
 generally an anti-pattern for most persistence stores, but I want to gather 
 feedback before taking the next step with the team.
 
 Do others consider this approach an anti-pattern, and if so, what are the 
 practical downsides?
 
 For one, this means that the Entity schema would contain the superset of all 
 columns for all rows. What is the impact of having thousands of columns names 
 in the schema? And what are the implications of modifying the schema 
 dynamically on a decent sized cluster (5 nodes now, growing to 10s later) 
 under load?
 
 Thanks,
 Todd


Re: unable to load data using sstableloader

2014-07-28 Thread Colin Kuo
Have you created the schema for these data files? I mean, the schema should
be created before you load these data files into C*.

Here is an article introducing sstableloader that you can refer to:
http://www.datastax.com/documentation/cassandra/1.2/cassandra/tools/toolsBulkloader_t.html
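
The error below is telling: sstableloader derives the keyspace and table from the last two
directory names, so a path ending in keyspace/col makes it look for a keyspace literally
named "keyspace". Given the file names (Test-Data-jb-1-*), the data appears to belong to
keyspace Test and table Data, so something like this should work (a sketch; the replication
settings and the table schema must match your own):

-- in cqlsh, create the target keyspace (and the matching "Data" table) first:
CREATE KEYSPACE IF NOT EXISTS "Test"
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

# then lay the files out as <keyspace>/<table>/ and point sstableloader at it:
mkdir -p /tmp/load/Test/Data
cp Test-Data-jb-1-* /tmp/load/Test/Data/
sstableloader -d localhost /tmp/load/Test/Data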



On Mon, Jul 28, 2014 at 7:28 PM, Akshay Ballarpure 
akshay.ballarp...@tcs.com wrote:


 Hello,
 I am unable to load sstable into cassandra using sstable loader, please
 suggest. Thanks.

 [root@CSL-simulation conf]# pwd
 /root/Akshay/Cassandra/apache-cassandra-2.0.8/conf
 [root@CSL-simulation conf]# ls -ltr keyspace/col/
 total 32
 -rw-r--r-- 1 root root   16 Jul 28 16:55 Test-Data-jb-1-Filter.db
 -rw-r--r-- 1 root root  300 Jul 28 16:55 Test-Data-jb-1-Index.db
 -rw-r--r-- 1 root root 3470 Jul 28 16:55 Test-Data-jb-1-Data.db
 -rw-r--r-- 1 root root8 Jul 28 16:55 Test-Data-jb-1-CRC.db
 -rw-r--r-- 1 root root   64 Jul 28 16:55 Test-Data-jb-1-Digest.sha1
 -rw-r--r-- 1 root root 4392 Jul 28 16:55 Test-Data-jb-1-Statistics.db
 -rw-r--r-- 1 root root   79 Jul 28 16:55 Test-Data-jb-1-TOC.txt


 [root@CSL-simulation conf]# ../bin/sstableloader -d localhost
 /root/Akshay/Cassandra/apache-cassandra-2.0.8/conf/keyspace/col/ --debug
 Could not retrieve endpoint ranges:
 InvalidRequestException(why:No such keyspace: keyspace)
 java.lang.RuntimeException: Could not retrieve endpoint ranges:
 at
 org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:259)
 at
 org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:149)
 at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:85)
 Caused by: InvalidRequestException(why:No such keyspace: keyspace)
 at
 org.apache.cassandra.thrift.Cassandra$describe_ring_result$describe_ring_resultStandardScheme.read(Cassandra.java:34055)
 at
 org.apache.cassandra.thrift.Cassandra$describe_ring_result$describe_ring_resultStandardScheme.read(Cassandra.java:34022)
 at
 org.apache.cassandra.thrift.Cassandra$describe_ring_result.read(Cassandra.java:33964)
 at
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
 at
 org.apache.cassandra.thrift.Cassandra$Client.recv_describe_ring(Cassandra.java:1251)
 at
 org.apache.cassandra.thrift.Cassandra$Client.describe_ring(Cassandra.java:1238)
 at
 org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:235)
 ... 2 more


 Thanks  Regards
 Akshay Ghanshyam Ballarpure
 Tata Consultancy Services
 Cell:- 9985084075
 Mailto: akshay.ballarp...@tcs.com
 Website: http://www.tcs.com
 
 Experience certainty.IT Services
Business Solutions
Consulting
 

 =-=-=
 Notice: The information contained in this e-mail
 message and/or attachments to it may contain
 confidential or privileged information. If you are
 not the intended recipient, any dissemination, use,
 review, distribution, printing or copying of the
 information contained in this e-mail message
 and/or attachments to it are strictly prohibited. If
 you have received this communication in error,
 please notify us by reply e-mail or telephone and
 immediately and permanently delete the message
 and any attachments. Thank you




Re: 2x disk space required for full compaction? Don't vnodes help this problem?

2014-07-24 Thread Colin Clark
Triggering a major compaction is usually not a good idea.

If you've got SSDs, go leveled as DuyHai says.  The results will be tasty.
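
Switching a table to leveled compaction is just a table property change, e.g. 
(the keyspace/table names and sstable size are placeholders):

ALTER TABLE my_keyspace.my_table
  WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};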

--
Colin
320-221-9531


On Jul 24, 2014, at 5:28 PM, Kevin Burton bur...@spinn3r.com wrote:

This was after a bootstrap… so I triggered a major compaction.  Should I
just turn on leveled compaction and then never do a major compaction?


On Thu, Jul 24, 2014 at 3:09 PM, DuyHai Doan doanduy...@gmail.com wrote:

 If you're using SizeTieredCompactionStrategy the disk space may double
 temporarily during compaction. That's one of the big drawbacks of
 SizeTiered. Since you're on SSD, why not test switching to
 LeveledCompaction? Put a node in write survey mode to see if this change
 has any impact on your I/O, CPU and node stability.
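
 Write survey mode is a start-up flag on the node being tested, e.g. in 
 cassandra-env.sh (a sketch):

 JVM_OPTS="$JVM_OPTS -Dcassandra.write_survey=true"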


 On Thu, Jul 24, 2014 at 11:56 PM, Kevin Burton bur...@spinn3r.com wrote:

 I just bootstrapped a new node.

 The box had about 220GB of data on it on a 400GB SSD drive.

 I triggered a full compaction after it bootstrapped, and it ran out of
 disk space about 15 minutes later.  so now that node is dead :-(

 I would have assumed that vnodes meant that I could keep my drive near
 100% full…

 so during a major compaction it would just compact the first vnode, then
 move on to the second.

 this would be analogous to bigtable / hbase regions.

 but … that doesn't seem to be the case.  (so bad assumption on my part)
 Both in terms of me actually seeing the disk fill up, and also the case of
 my disk not having separate SSTables for each vnode.

 So now I have these SSDs that I have to keep at < 50% capacity at all
 times.

 I can see why on HDDs having too many files would be an issue.

 But on SSDs this is less of a problem.

 Perhaps some hybrid where vnodes are chunked together in one contiguous
 region?

 Is there a way to fix this problem? I would like to get more usage out of
 my SSDs...

 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com





-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com


Re: Fetching ONE cell with a row cache hit takes 1 second on an idle box?

2014-07-01 Thread Colin
The row cache is typically turned off because it is only useful in very specific 
situations - the row(s) need to fit in memory.  Also, the access patterns have to 
fit.

If all the rows you're accessing can fit, the row cache is a great thing. Otherwise, 
not so great.
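
Turning the row cache off for a single table is just a caching property change 
(the table name is a placeholder; the string form is the pre-2.1 syntax, the map 
form is 2.1+):

ALTER TABLE my_keyspace.my_table WITH caching = 'keys_only';
ALTER TABLE my_keyspace.my_table WITH caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'};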

--
Colin
320-221-9531


 On Jul 1, 2014, at 10:40 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 WOW.. so based on your advice, and a test, I disabled the row cache for the 
 table.
 
 The query was instantly 20x faster.
 
 so this is definitely an anti-pattern.. I suspect cassandra just tries to 
 read the entire physical row into memory, and since my physical row is 
 rather big.. ha.  Well that wasn't very fun :)
 
 BIG win though ;)
 
 
 On Tue, Jul 1, 2014 at 6:52 PM, Kevin Burton bur...@spinn3r.com wrote:
 A work around for this, is the VFS page cache.. basically, disabling 
 compression, and then allowing the VFS page cache to keep your data in 
 memory.
 
 The only downside is the per column overhead.  But if you can store 
 everything in a 'blob' which is optionally compressed, you're generally 
 going to be ok.
 
 Kevin
 
 
 On Tue, Jul 1, 2014 at 6:50 PM, Kevin Burton bur...@spinn3r.com wrote:
 so.. caching the *queries* ?
 
 it seems like a better mechanism would be to cache the actually logical 
 row…, not the physical row.  
 
 Query caches just don't work in production,  If you re-word your query, or 
 structure it a different way, you get a miss…
 
 In my experience.. query caches have a 0% hit rate.
 
 
 On Tue, Jul 1, 2014 at 6:40 PM, Robert Coli rc...@eventbrite.com wrote:
 On Tue, Jul 1, 2014 at 6:06 PM, Kevin Burton bur...@spinn3r.com wrote:
 you know.. one thing I failed to mention.. .is that this is going into a 
 bucket and while it's a logical row, the physical row is like 500MB … 
 according to compaction logs.
 
 is the ENTIRE physical row going into the cache as one unit?  That's 
 definitely going to be a problem in this model.  500MB is a big atomic 
 unit.
 
 Yes, the row cache is a row cache. It caches what the storage engine calls 
 rows, which CQL calls partitions. [1] Rows have to be assembled from all 
 of their row fragments in Memtables/SSTables.
 
 This is a big part of why the off-heap row cache's behavior of 
 invalidation on write is so bad for its overall performance. Updating a 
 single column in your 500MB row invalidates it and forces you to assemble 
 the entire 500MB row from disk. 
 
 The only valid use case for the current off-heap row cache seems to be : 
 very small, very uniform in size, very hot, and very rarely modified.
 
 https://issues.apache.org/jira/browse/CASSANDRA-5357
 
 Is the ticket for replacing the row cache and its unexpected 
 characteristics with something more like an actual query cache.
 
 also.. I assume it's having to do a binary search within the physical row 
 ? 
 
 Since the column level bloom filter's removal in 1.2, the only way it can 
 get to specific columns is via the index.
 
 =Rob
 [1] https://issues.apache.org/jira/browse/CASSANDRA-6632
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 


Re: Advice on how to handle corruption in system/hints

2014-06-09 Thread Colin Kuo
Hi Francois,

We're facing the same issue as yours. The approach we took is to:

1. scrub the corrupted data file
2. repair that column family

Deleting the corrupted files straight away is not recommended while the C*
instance is running.
This kind of corruption can happen after a bad disk or a power outage.
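
Roughly, and with the user keyspace/table names as placeholders (the hints table
lives in the system keyspace):

nodetool scrub system hints            # rewrite the corrupted hints sstable(s)
nodetool repair my_keyspace my_table   # then re-sync any data the lost hints covered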

Thanks,

Colin


http://about.me/ColinKuo
Colin Kuo
about.me/ColinKuo
[image: Colin Kuo on about.me]

http://about.me/ColinKuo


On Mon, Jun 9, 2014 at 6:11 AM, Francois Richard frich...@yahoo-inc.com
wrote:

  Hi everyone,

  We are running some Cassandra clusters (Usually a cluster of 5 nodes
 with replication factor of 3.)  And at least once per day we do see some
 corruption related to a specific sstable in system/hints. (We are using
 Cassandra version 1.2.16 on RHEL 6.5)

  Here is an example of such exception:

   ERROR [CompactionExecutor:1694] 2014-06-08 21:37:33,267
 CassandraDaemon.java (line 191) Exception in thread
 Thread[CompactionExecutor:1694,1,main]

 org.apache.cassandra.io.sstable.CorruptSSTableException:
 java.io.IOException: dataSize of 8224262783474088549 starting at 502360510
 would be larger than file /home/y/var/cassandra/data/syste

 m/hints/system-hints-ic-281-Data.db length 504590769

 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.init(SSTableIdentityIterator.java:167)

 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.init(SSTableIdentityIterator.java:83)

 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.init(SSTableIdentityIterator.java:69)

 at
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)

 at
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)

 at
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)

 at
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)

 at
 org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:145)

 at
 org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)

 at
 org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)

 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)

 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)

 at
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)

 at
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)

 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)

 at
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)

 at
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)

 at
 org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)

 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)

 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

 at java.util.concurrent.FutureTask.run(FutureTask.java:262)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

 at java.lang.Thread.run(Thread.java:745)

 Caused by: java.io.IOException: dataSize of 8224262783474088549 starting
 at 502360510 would be larger than file
 /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
 504590769

 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.init(SSTableIdentityIterator.java:123)

 ... 23 more

  INFO [HintedHandoff:35] 2014-06-08 21:37:33,267
 HintedHandOffManager.java (line 296) Started hinted handoff for host:
 502a48cd-171b-4e83-a9ad-67f32437353a with IP: /10.210.239.190

 ERROR [HintedHandoff:33] 2014-06-08 21:37:33,267 CassandraDaemon.java
 (line 191) Exception in thread Thread[HintedHandoff:33,1,main]

 java.lang.RuntimeException: java.util.concurrent.ExecutionException:
 org.apache.cassandra.io.sstable.CorruptSSTableException:
 java.io.IOException: dataSize of 8224262783474088549 starting at 502360510
 would be larger than file
 /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
 504590769

 at
 org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:441)

 at
 org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)

 at
 org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90)

 at
 org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:508)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java

Re: high pending compactions

2014-06-09 Thread Colin Kuo
As Jake suggested, you could first increase
compaction_throughput_mb_per_sec and concurrent_compactions to suitable
values if system resources allow. From my understanding, a major
compaction will internally acquire a lock before running. In your
case, there might be a major compaction blocking the following pending
compaction tasks. You could check the output of nodetool compactionstats
and the C* system log to confirm.

If the running compaction is compacting a wide row for a long time, you could
try tuning the in_memory_compaction_limit_in_mb value.
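
For example (values are illustrative; concurrent compactions are set in
cassandra.yaml and need a restart):

nodetool setcompactionthroughput 64    # MB/s, 0 = unthrottled
nodetool compactionstats               # what is actually running vs. pending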

Thanks,



On Sun, Jun 8, 2014 at 11:27 PM, S C as...@outlook.com wrote:

 I am using Cassandra 1.1 (sorry bit old) and I am seeing high pending
 compaction count. pending tasks: 67 while active compaction tasks are
 not more than 5. I have a 24CPU machine. Shouldn't I be seeing more
 compactions? Is this a pattern of high writes and compactions backing up?
 How can I improve this? Here are my thoughts.


1. Increase memtable_total_space_in_mb
2. Increase compaction_throughput_mb_per_sec
3. Increase concurrent_compactions


 Sorry if this was discussed already. Any pointers is much appreciated.

 Thanks,
 Kumar



Re: How to restart bootstrap after a failed streaming due to Broken Pipe (1.2.16)

2014-06-09 Thread Colin Kuo
You can use nodetool repair instead. Repair is able to re-transmit the
data that belongs to the new node.
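
If you do re-bootstrap, throttling streaming on the source nodes and widening the
streaming socket timeout usually helps the bootstrap survive these hiccups
(values are illustrative):

nodetool setstreamthroughput 200                 # MB/s cap on outbound streams
# cassandra.yaml, all nodes:
streaming_socket_timeout_in_ms: 3600000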



On Tue, Jun 10, 2014 at 10:40 AM, Mike Heffner m...@librato.com wrote:

 Hi,

 During an attempt to bootstrap a new node into a 1.2.16 ring the new node
 saw one of the streaming nodes periodically disappear:

  INFO [GossipTasks:1] 2014-06-10 00:28:52,572 Gossiper.java (line 823)
 InetAddress /10.156.1.2 is now DOWN
 ERROR [GossipTasks:1] 2014-06-10 00:28:52,574 AbstractStreamSession.java
 (line 108) Stream failed because /10.156.1.2 died or was
 restarted/removed (streams may still be active in background, but further
 streams won't be started)
  WARN [GossipTasks:1] 2014-06-10 00:28:52,574 RangeStreamer.java (line
 246) Streaming from /10.156.1.2 failed
  INFO [HANDSHAKE-/10.156.1.2] 2014-06-10 00:28:57,922
 OutboundTcpConnection.java (line 418) Handshaking version with /10.156.1.2
  INFO [GossipStage:1] 2014-06-10 00:28:57,943 Gossiper.java (line 809)
 InetAddress /10.156.1.2 is now UP

 This brief interruption was enough to kill the streaming from node
 10.156.1.2. Node 10.156.1.2 saw a similar broken pipe exception from the
 bootstrapping node:

 ERROR [Streaming to /10.156.193.1.3] 2014-06-10 01:22:02,345
 CassandraDaemon.java (line 191) Exception in thread Thread[Streaming to /
 10.156.1.3:1,5,main]
  java.lang.RuntimeException: java.io.IOException: Broken pipe
 at com.google.common.base.Throwables.propagate(Throwables.java:160)
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)
 Caused by: java.io.IOException: Broken pipe
 at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
 at
 sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:420)
 at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:552)
 at
 org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(CompressedFileStreamTask.java:93)
 at
 org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)


 During bootstrapping we notice a significant spike in CPU and latency
 across the board on the ring (CPU 50-85% and write latencies 60ms -
 250ms). It seems likely that this persistent high load led to the hiccup
 that caused the gossiper to see the streaming node as briefly down.

 What is the proper way to recover from this? The original estimate was
 almost 24 hours to stream all the data required to bootstrap this single
 node (streaming set to unlimited) and this occurred 6 hours into the
 bootstrap. With such high load from streaming it seems that simply
 restarting will inevitably hit this problem again.


 Cheers,

 Mike

 --

   Mike Heffner m...@librato.com
   Librato, Inc.




Re: Object mapper for CQL

2014-06-08 Thread Colin
I would check out Spring Data Cassandra - most of the Java drivers out there for 
Cassandra offer very little over the new 2.0 driver from DataStax.  Or just use 
the Java driver 2.0 as is.

There's even a query builder (a light fluent DSL) if you don't like CQL.  Based 
upon your use case description so far, I don't think you need to get too funky 
with your data access layer.

Whatever you do, make sure the driver you use supports CQL 3 and the native 
protocol.  Thrift, like BOP, will most likely go away at some point in the 
future.
--
Colin
320-221-9531


 On Jun 8, 2014, at 8:58 PM, Johan Edstrom seij...@gmail.com wrote:
 
 Kevin, 
 
 We are about to release 2.0 of https://github.com/savoirtech/hecate
 It is an ASL licensed library that started with Jeff Genender writing a Pojo
 library in Hector for a project we did for Ecuador (Essentially all of 
 Ecuador uses this).
 I extended this with Pojo Graph stuff like Collections and Composite key 
 indexing.
 
 James Carman then took this a bit further in Cassidy with some new concepts.
 A while back I then decided to bite the bullet, swallow my hatred of CQL, and just 
 write the same thing; it started out with a very reflection-heavy and somewhat clunky 
 interface, and 
 James decided to re-write it and incorporate the learnings from Cassidy.
 
 - Jeff, James and I all work together. This library is already in use and has 
 been 
 used with around 30 million accounts as well as under quite decent loads.
 
 What you see in trunk now under hecate-cql3 is what'll go out as 2.0, it is a 
 new API, 
 we support single pojo and Object graph, column modifiers, indexer and 
 everything
 else we could think of in a library that isn't ORM but maps data to C*.
 
 What will be out in I think 2.0.2 is an external indexer very much like Titan 
 and 
 possibly some more real graph (vertices) stuff. We are also looking at a 
 SchemaIdentifier
 so that we can get back to working with dynamic columns at a decent 
 conceptual speed :)
 
 /je
 
 On Jun 8, 2014, at 2:46 AM, DuyHai Doan doanduy...@gmail.com wrote:
 
 You can have a look at Achilles, it's using the Java Driver underneath : 
 https://github.com/doanduyhai/Achilles
 
 On 8 June 2014 at 04:24, Kevin Burton bur...@spinn3r.com wrote:
 Looks like the java-driver is working on an object mapper:
 
 More modules including a simple object mapper will come shortly.
 But of course I need one now … 
 I'm curious what others are doing here.  
 
 I don't want to pass around Row objects in my code if I can avoid it.. 
 Ideally I would just run a query and get back a POJO.  
 
 Another issue is how are these POJOs generated.  Are they generated from the 
 schema?  is the schema generated from the POJOs ?  From a side file?  
 
 And granted, there are existing ORMs out there but I don't think any support 
 CQL.
 
 -- 
 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 


Re: Object mapper for CQL

2014-06-08 Thread Colin
I wasn't responding as a Datastax employee.

I have used hector, Achilles and a few others as well.  The .net drivers used 
to have an edge, but that is evaporating as well.

I have also built my own mapping layers.

But all of that was when the drivers from DataStax weren't there yet.

Yes, I work for Datastax.  I also speak at meetups, and contribute to the 
community.  

Datastax doesn't  charge for the drivers by the way.

I have seen folks use third party drivers and end up paying for it down the 
road. 

If you're going to consider using a community driver, then I would recommend 
something that wraps the Datastax drivers, like netflix does.

All I am saying is that sometimes people make using Cassandra more complex 
than it needs to be and end up introducing a lot of new tech in their initial 
adoption - this increases the risk of the project.

Also, I wouldn't use anything built on thrift.  Datastax has a growing driver 
team, a growing focus on testing and certification, and if you end up wanting 
support for your project and are using an unsupported driver, it can make your 
life more difficult.

In response to how quickly I responded: I often try to provide assistance out 
here - I don't get paid for it, and it's not part of my job. Having close to 5 
years of production experience with Cassandra means that I have made all the 
mistakes out there and probably invented a few of my own.

I have watched a lot of the questions Kevin has asked - his project is ambitious 
for a first dip into Cassandra, I want to see him succeed, and I have given him 
the same advice I give our customers.
--
Colin
320-221-9531


 On Jun 8, 2014, at 9:43 PM, Jeff Genender jgenen...@apache.org wrote:
 
 Comments in line...
 
 On Jun 8, 2014, at 8:05 PM, Colin colpcl...@gmail.com wrote:
 
 I would check out Spring Data Cassandra - most of the Java drivers out there for 
 Cassandra offer very little over the new 2.0 driver from DataStax.  Or just 
 use the Java driver 2.0 as is.
 
 Interesting… answer came within 7 minutes… from a vendor (Datastax employee)… 
 and terribly opinionated without data to back up… I’m just sayin… ;-)
 
 Colin… did you even look at the driver referenced by Johan?  If so, that's 
 certainly the fastest code review and driver test I have ever seen. ;-)
 
 Perhaps a bit more kindness may be more appropriate?  Not a great way to 
 build contributions from the community...
 
 SNIP
 
 Whatever you do, make sure the driver you use supports CQL 3 and the native 
 protocol.  Thrift, like BOP, will most likely go away at some point in the 
 future.
 
 Read what Johan stated… “hecate-cql3” — CQL 3
 
 I think a nice look at what was produced may be a good thing for the 
 community and maybe even Datastax may think its kinda cool?
 
 Jeff Genender
 Apache Member
 http://www.apache.org
 
 
 --
 Colin
 320-221-9531
 
 
 On Jun 8, 2014, at 8:58 PM, Johan Edstrom seij...@gmail.com wrote:
 
 Kevin, 
 
 We are about to release 2.0 of https://github.com/savoirtech/hecate
 It is an ASL licensed library that started with Jeff Genender writing a Pojo
 library in Hector for a project we did for Ecuador (Essentially all of 
 Ecuador uses this).
 I extended this with Pojo Graph stuff like Collections and Composite key 
 indexing.
 
 James Carman then took this a bit further in Cassidy with some new concepts.
 I then a while back decided to bite the bullet and my hatred of CQL and 
 just write 
 the same thing, it started out with a very reflection and somewhat clunky 
 interface, 
 James decided to re-write this and incorporate the learnings from Cassidy.
 
 - Jeff, James and I all work together. This library is already in use and 
 has been 
 in use under 30 mil account circumstances as well as quite decent loads.
 
 What you see in trunk now under hecate-cql3 is what'll go out as 2.0, it is 
 a new API, 
 we support single pojo and Object graph, column modifiers, indexer and 
 everything
 else we could think of in a library that isn't ORM but maps data to C*.
 
 What will be out in I think 2.0.2 is an external indexer very much like 
 Titan and 
 possibly some more real graph (vertices) stuff. We are also looking at an 
 SchemaIdentifier
 so that we can get back to working with dynamic columns at a decent 
 conceptual speed :)
 
 /je
 
 On Jun 8, 2014, at 2:46 AM, DuyHai Doan doanduy...@gmail.com wrote:
 
 You can have a look at Achilles, it's using the Java Driver underneath : 
 https://github.com/doanduyhai/Achilles
 
 On 8 June 2014 at 04:24, Kevin Burton bur...@spinn3r.com wrote:
 Looks like the java-driver is working on an object mapper:
 
 More modules including a simple object mapper will come shortly.
 But of course I need one now … 
 I'm curious what others are doing here.  
 
 I don't want to pass around Row objects in my code if I can avoid it.. 
 Ideally I would just run a query and get back a POJO.  
 
 Another issue is how are these POJOs generated.  Are they generated from 
 the schema?  is the schema generated from

Re: Object mapper for CQL

2014-06-08 Thread Colin
Sounds like you've done some great work.  But I still think it's a good idea 
for people new to Cassandra to establish a baseline so that they have something 
to compare other approaches against.

It sounds like we potentially have different views in this regard, but we are 
still interested in the same thing - helping people be successful using Cassandra.

--
Colin
320-221-9531


 On Jun 8, 2014, at 10:24 PM, Johan Edstrom seij...@gmail.com wrote:
 
 On a second reply I'll provide some docs.
 
 We looked at Astyanax (yeah, I didn't like the refactor)
 We looked at spring - Are you fucking kidding me?
 We have done quite a bit of work in the ORM arena.
 
 * I passionately hate the idea of CQL.  *
 
 So - I told myself, I need to make this work so I never ever
 have to work with that. See, I liked Bigtable; I loved the idea of modeling 
 without 
 constrained and contrived relations. I was even more of a fan of 
 combining analytics and adjoining vertices.
 
 That said, Hecate-CQL3 does address all of the above, as well as providing 
 a POJO/DAO cache, a table cache, and a "what was changed" store.
 
 If you actually think you'll be writing enterprise code at speed using 
 a Rowset, sorry, you need a foam helmet.
 
 /je
 
 
 
 On Jun 8, 2014, at 9:05 PM, Colin colpcl...@gmail.com wrote:
 
 I wasn't responding as a Datastax employee.
 
 I have used hector, Achilles and a few others as well.  The .net drivers 
 used to have an edge, but that is evaporating as well.
 
 I have also built my own mapping layers.
 
 But all of that was when the drivers from DataStax weren't there yet.
 
 Yes, I work for Datastax.  I also speak at meetups, and contribute to the 
 community.  
 
 Datastax doesn't  charge for the drivers by the way.
 
 I have seen folks use third party drivers and end up paying for it down the 
 road. 
 
 If you're going to consider using a community driver, then I would recommend 
 something that wraps the Datastax drivers, like netflix does.
 
 All I am saying is that sometimes people make using Cassandra more complex 
 than it needs to be and end up introducing a lot of new tech in their 
 initial adoption - this increases the risk of the project.
 
 Also, I wouldn't use anything built on thrift.  Datastax has a growing 
 driver team, a growing focus on testing and certification, and if you end up 
 wanting support for your project and are using an unsupported driver, it can 
 make your life more difficult.
 
 In response to how quickly  responded, I often try to provide assistance out 
 here-I don't get paid for it, and it's not part of my job. Having close to 5 
 years of production experience with Cassandra means that I have made all the 
 mistakes out there and probably invented a few of my own.
 
 I have watched a lot of the questions Kevin has asked - his project is 
 ambitious for a first dip into Cassandra, I want to see him succeed, and 
 I have given him the same advice I give our customers.
 --
 Colin
 320-221-9531
 
 
 On Jun 8, 2014, at 9:43 PM, Jeff Genender jgenen...@apache.org wrote:
 
 Comments in line...
 
 On Jun 8, 2014, at 8:05 PM, Colin colpcl...@gmail.com wrote:
 
 I would check out Spring Data Cassandra - most of the Java drivers out there for 
 Cassandra offer very little over the new 2.0 driver from DataStax.  Or just 
 use the Java driver 2.0 as is.
 
 Interesting… answer came within 7 minutes… from a vendor (Datastax 
 employee)… and terribly opinionated without data to back up… I’m just 
 sayin… ;-)
 
 Colin… did you even look at the driver referenced by Johan?  If so, that's 
 certainly the fastest code review and driver test I have ever seen. ;-)
 
 Perhaps a bit more kindness may be more appropriate?  Not a great way to 
 build contributions from the community...
 
 SNIP
 
 Whatever you do, make sure the driver you use supports CQL 3 and the 
 native protocol.  Thrift, like BOP, will most likely go away at some point 
 in the future.
 
 Read what Johan stated… “hecate-cql3” — CQL 3
 
 I think a nice look at what was produced may be a good thing for the 
 community and maybe even Datastax may think its kinda cool?
 
 Jeff Genender
 Apache Member
 http://www.apache.org
 
 
 --
 Colin
 320-221-9531
 
 
 On Jun 8, 2014, at 8:58 PM, Johan Edstrom seij...@gmail.com wrote:
 
 Kevin, 
 
 We are about to release 2.0 of https://github.com/savoirtech/hecate
 It is an ASL licensed library that started with Jeff Genender writing a 
 Pojo
 library in Hector for a project we did for Ecuador (Essentially all of 
 Ecuador uses this).
 I extended this with Pojo Graph stuff like Collections and Composite key 
 indexing.
 
 James Carman then took this a bit further in Cassidy with some new 
 concepts.
 I then a while back decided to bite the bullet and my hatred of CQL and 
 just write 
 the same thing, it started out with a very reflection and somewhat clunky 
 interface, 
 James decided to re-write this and incorporate the learnings from Cassidy.
 
 - Jeff, James and I all work together. This library is already

Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin
I believe ByteOrderedPartitioner is being deprecated, and for good reason.  I 
would look at what you could achieve by using wide rows and Murmur3Partitioner.



--
Colin
320-221-9531


 On Jun 6, 2014, at 5:27 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 We have the requirement to have clients read from our tables while they're 
 being written.
 
 Basically, any write that we make to cassandra needs to be sent out over the 
 Internet to our customers.
 
 We also need them to resume so if they go offline, they can just pick up 
 where they left off.
 
 They need to do this in parallel, so if we have 20 cassandra nodes, they can 
 have 20 readers each efficiently (and without coordination) reading from our 
 tables.
 
 Here's how we're planning on doing it.
 
 We're going to use the ByteOrderedPartitioner .
 
 I'm writing with a primary key of the timestamp, however, in practice, this 
 would yield hotspots.
 
 (I'm also aware that time isn't a very good pk in a distributed system, as I 
 can easily have a collision, so we're going to use a scheme similar to a uuid 
 to make it unique per writer).
 
 One node would take all the load, followed by the next node, etc.
 
 So my plan to stop this is to prefix a slice ID to the timestamp.  This way 
 each piece of content has a unique ID, but the prefix will place it on a node.
 
 The slice ID is just a byte… so this means there are 255 buckets in which I 
 can place data.  
 
 This means I can have clients each start with a slice, and a timestamp, and 
 page through the data with tokens.
 
 This way I can have a client reading with 255 threads from 255 regions in the 
 cluster, in parallel, without any hot spots.
 
 Thoughts on this strategy?  
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin Clark
It's an anti-pattern and there are better ways to do this.

I have implemented the paging algorithm you've described using wide rows
and bucketing.  This approach is a more efficient utilization of
Cassandra's built in wholesome goodness.

Also, I wouldn't let any number of clients (huge) connect directly to the
cluster to do this - put some type of app server in between to handle the
comms and fan-out.  You'll get better utilization of resources and less
overhead, in addition to flexibility in which data center you're utilizing
to serve requests.



--
Colin
320-221-9531


On Jun 7, 2014, at 12:28 PM, Kevin Burton bur...@spinn3r.com wrote:

I just checked the source and in 2.1.0 it's not deprecated.

So it *might* be *being* deprecated but I haven't seen anything stating
that.


On Sat, Jun 7, 2014 at 8:03 AM, Colin colpcl...@gmail.com wrote:

 I believe ByteOrderedPartitioner is being deprecated, and for good reason.
  I would look at what you could achieve by using wide rows and
 Murmur3Partitioner.



 --
 Colin
 320-221-9531


 On Jun 6, 2014, at 5:27 PM, Kevin Burton bur...@spinn3r.com wrote:

 We have the requirement to have clients read from our tables while they're
 being written.

 Basically, any write that we make to cassandra needs to be sent out over
 the Internet to our customers.

 We also need them to resume so if they go offline, they can just pick up
 where they left off.

 They need to do this in parallel, so if we have 20 cassandra nodes, they
 can have 20 readers each efficiently (and without coordination) reading
 from our tables.

 Here's how we're planning on doing it.

 We're going to use the ByteOrderedPartitioner .

 I'm writing with a primary key of the timestamp, however, in practice,
 this would yield hotspots.

 (I'm also aware that time isn't a very good pk in a distributed system, as I
 can easily have a collision, so we're going to use a scheme similar to a
 uuid to make it unique per writer).

 One node would take all the load, followed by the next node, etc.

 So my plan to stop this is to prefix a slice ID to the timestamp.  This
 way each piece of content has a unique ID, but the prefix will place it on
 a node.

 The slice ID is just a byte… so this means there are 255 buckets in which
 I can place data.

 This means I can have clients each start with a slice, and a timestamp,
 and page through the data with tokens.

 This way I can have a client reading with 255 threads from 255 regions in
 the cluster, in parallel, without any hot spots.

 Thoughts on this strategy?

 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 Skype: *burtonator*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are
 people.




-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin
Maybe it makes sense to describe what you're trying to accomplish in more 
detail.

A common bucketing approach is along the lines of year, month, day, hour, 
minute, etc., and then using a timeuuid as a clustering column.  
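
Concretely, something along these lines (a sketch; the table/column names and the 
bucket count are made up for illustration):

CREATE TABLE crawl_feed (
    bucket  int,        -- e.g. hash(writer id) % 100, spreads writes across nodes
    day     text,       -- coarse time component, e.g. '2014-06-07'
    ts      timeuuid,   -- clustering column keeps rows in time order
    payload blob,
    PRIMARY KEY ((bucket, day), ts)
);

-- a client resuming bucket 42 from where it left off:
SELECT * FROM crawl_feed
WHERE bucket = 42 AND day = '2014-06-07' AND ts > minTimeuuid('2014-06-07 13:00:00')
LIMIT 1000;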

Depending upon the semantics of the transport protocol you plan on utilizing, 
either the client code keeps track of pagination, or the app server could, if 
you utilized some type of request/reply/ack flow.  You could keep sequence 
numbers for each client, and begin streaming data to them or allow querying 
upon reconnect, etc.

But again, more details of the use case might prove useful.

--
Colin
320-221-9531


 On Jun 7, 2014, at 1:53 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 Another way around this is to have a separate table storing the number of 
 buckets.
 
 This way if you have too few buckets, you can just increase them in the 
 future.
 
 Of course, the older data will still have too few buckets :-(
 
 
 On Sat, Jun 7, 2014 at 11:09 AM, Kevin Burton bur...@spinn3r.com wrote:
 
 On Sat, Jun 7, 2014 at 10:41 AM, Colin Clark co...@clark.ws wrote:
 It's an anti-pattern and there are better ways to do this.
 
 Entirely possible :)
 
 It would be nice to have a document with a bunch of common cassandra design 
 patterns.
 
 I've been trying to track down a pattern for this, and a lot of this is 
 pieced together across different places and individual blog posts, so one has 
 to reverse engineer it.
  
 I have implemented the paging algorithm you've described using wide rows 
 and bucketing.  This approach is a more efficient utilization of 
 Cassandra's built in wholesome goodness.
 
 So.. I assume the general pattern is to:
 
 create a bucket.. you create like 2^16 buckets, this is your partition key.  
  
 
 Then you place a timestamp next to the bucket in a primary key.
 
 So essentially:
 
 primary key( bucket, timestamp )… 
 
 .. so to read from this bucket you essentially execute: 
 
 select * from foo where bucket = 100 and timestamp > 12345790 limit 1;
  
 
 Also, I wouldn't let any number of clients (huge) connect directly the 
 cluster to do this-put some type of app server in between to handle the 
 comm's and fan out.  You'll get better utilization of resources and less 
 overhead in addition to flexibility of which data center you're utilizing 
 to serve requests. 
 
 this is interesting… since the partition is the bucket, you could make some 
 poor decisions based on the number of buckets.
 
 For example, 
 
 if you use 2^64 buckets, the number of items in each bucket is going to be 
 rather small.  So you're going to have tons of queries each fetching 0-1 row 
 (if you have a small amount of data).
 
 But if you use very FEW buckets.. say 5, but you have a cluster of 1000 
 nodes, then you will have 5 of these buckets on 5 nodes, and the rest of the 
 nodes without any data.
 
 Hm..
 
 the byte ordered partitioner solves this problem because I can just pick a 
 fixed number of buckets and then this is the primary key prefix and the data 
 in a bucket can be split up across machines based on any arbitrary split 
 even in the middle of a 'bucket' …
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin
Then add seconds to the bucket.  Also, the data will get cached - it's not going 
to hit disk on every read.

Look at the key cache settings on the table.  Also, in 2.1 you have even more 
control over caching.

--
Colin
320-221-9531


 On Jun 7, 2014, at 4:30 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 
 On Sat, Jun 7, 2014 at 1:34 PM, Colin colpcl...@gmail.com wrote:
 Maybe it makes sense to describe what you're trying to accomplish in more 
 detail.
 
Essentially, I'm appending writes of recent data by our crawler and sending 
that data to our customers.
 
 They need to sync to up-to-date writes… we need to get them writes within 
 seconds. 
 
 A common bucketing approach is along the lines of year, month, day, hour, 
 minute, etc and then use a timeuuid as a cluster column.  
 
 I mean that is acceptable.. but that means for that 1 minute interval, all 
 writes are going to that one node (and its replicas)
 
 So that means the total cluster throughput is bottlenecked on the max disk 
 throughput.
 
 Same thing for reads… unless our customers are lagged, they are all going to 
 stampede and ALL of them are going to read data from one node, in a one 
 minute timeframe.
 
 That's no fun..  that will easily DoS our cluster.
  
 Depending upon the semantics of the transport protocol you plan on 
 utilizing, either the client code keep track of pagination, or the app 
 server could, if you utilized some type of request/reply/ack flow.  You 
 could keep sequence numbers for each client, and begin streaming data to 
 them or allowing query upon reconnect, etc.
 
 But again, more details of the use case might prove useful.
 
 I think if we were to use just 100 buckets it would probably work just fine.  
 We're probably not going to be more than 100 nodes in the next year, and if we 
 are, that's still reasonable performance.  
 
 I mean if each box has a 400GB SSD that's 40TB of VERY fast data. 
 
 Kevin
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin Clark
No, you're not - the partition key will get distributed across the cluster if
you're using random or murmur.  You could also ensure that by adding
another column, like source, to ensure distribution. (Add the seconds to the
partition key, not the clustering columns.)

I can almost guarantee that if you put too much thought into working
against what Cassandra offers out of the box, that it will bite you later.

In fact, the use case that you're describing may best be served by a
queuing mechanism, and using Cassandra only for the underlying store.

I used this exact same approach in a use case that involved writing over a
million events/second to a cluster with no problems.  Initially, I thought
ordered partitioner was the way to go too.  And I used separate processes
to aggregate, conflate, and handle distribution to clients.

Just my two cents, but I also spend the majority of my days helping people
utilize Cassandra correctly, and rescuing those that haven't.

:)

--
Colin
320-221-9531


On Jun 7, 2014, at 6:53 PM, Kevin Burton bur...@spinn3r.com wrote:

Well, you could add milliseconds; at best you're still bottlenecking most of
your writes on one box.. maybe 2-3 if there are ones that are lagging.

Anyway.. I think using 100 buckets is probably fine..

Kevin


On Sat, Jun 7, 2014 at 2:45 PM, Colin colpcl...@gmail.com wrote:

 Then add seconds to the bucket.  Also, the data will get cached - it's not
 going to hit disk on every read.

 Look at the key cache settings on the table.  Also, in 2.1 you have even
 more control over caching.

 --
 Colin
 320-221-9531


 On Jun 7, 2014, at 4:30 PM, Kevin Burton bur...@spinn3r.com wrote:


 On Sat, Jun 7, 2014 at 1:34 PM, Colin colpcl...@gmail.com wrote:

 Maybe it makes sense to describe what you're trying to accomplish in more
 detail.


 Essentially , I'm appending writes of recent data by our crawler and
 sending that data to our customers.

 They need to sync to up to date writes…we need to get them writes within
 seconds.

 A common bucketing approach is along the lines of year, month, day, hour,
 minute, etc and then use a timeuuid as a cluster column.


 I mean that is acceptable.. but that means for that 1 minute interval, all
 writes are going to that one node (and its replicas)

 So that means the total cluster throughput is bottlenecked on the max disk
 throughput.

 Same thing for reads… unless our customers are lagged, they are all going
 to stampede and ALL of them are going to read data from one node, in a one
 minute timeframe.

 That's no fun..  that will easily DoS our cluster.


 Depending upon the semantics of the transport protocol you plan on
 utilizing, either the client code keep track of pagination, or the app
 server could, if you utilized some type of request/reply/ack flow.  You
 could keep sequence numbers for each client, and begin streaming data to
 them or allowing query upon reconnect, etc.

 But again, more details of the use case might prove useful.


 I think if we were to just 100 buckets it would probably work just fine.
  We're probably not going to be more than 100 nodes in the next year and if
 we are that's still reasonable performance.

 I mean if each box has a 400GB SSD that's 40TB of VERY fast data.

 Kevin

 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 Skype: *burtonator*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are
 people.




-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin Clark
Not if you add another column to the partition key; source for example.

I would really try to stay away from the ordered partitioner if at all
possible.

What ingestion rates are you expecting, in size and speed.

--
Colin
320-221-9531


On Jun 7, 2014, at 9:05 PM, Kevin Burton bur...@spinn3r.com wrote:


Thanks for the feedback on this btw.. .it's helpful.  My notes below.

On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark co...@clark.ws wrote:

 No, you're not-the partition key will get distributed across the cluster
 if you're using random or murmur.


Yes… I'm aware.  But in practice this is how it will work…

If we create bucket b0, that will get hashed to h0…

So say I have 50 machines performing writes, they are all on the same time
thanks to ntpd, so they all compute b0 for the current bucket based on the
time.

That gets hashed to h0…

If h0 is hosted on node0 … then all writes go to node zero for that 1
second interval.

So all my writes are bottlenecking on one node.  That node is *changing*
over time… but they're not being dispatched in parallel over N nodes.  At
most writes will only ever reach 1 node a time.



 You could also ensure that by adding another column, like source to ensure
 distribution. (Add the seconds to the partition key, not the clustering
 columns)

 I can almost guarantee that if you put too much thought into working
 against what Cassandra offers out of the box, that it will bite you later.


Sure.. I'm trying to avoid the 'bite you later' issues. More so because I'm
sure there are Cassandra gotchas to worry about.  Everything has them.
 Just trying to avoid the land mines :-P


 In fact, the use case that you're describing may best be served by a
 queuing mechanism, and using Cassandra only for the underlying store.


Yes… that's what I'm doing.  We're using apollo to fan out the queue, but
the writes go back into cassandra and needs to be read out sequentially.



 I used this exact same approach in a use case that involved writing over a
 million events/second to a cluster with no problems.  Initially, I thought
 ordered partitioner was the way to go too.  And I used separate processes
 to aggregate, conflate, and handle distribution to clients.



Yes. I think using 100 buckets will work for now.  Plus I don't have to
change the partitioner on our existing cluster and I'm lazy :)



 Just my two cents, but I also spend the majority of my days helping people
 utilize Cassandra correctly, and rescuing those that haven't.


Definitely appreciate the feedback!  Thanks!

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin Clark
With 100 nodes, that ingestion rate is actually quite low and I don't think
you'd need another column in the partition key.

You seem to be set in your current direction.  Let us know how it works out.

--
Colin
320-221-9531


On Jun 7, 2014, at 9:18 PM, Kevin Burton bur...@spinn3r.com wrote:

What's 'source' ? You mean like the URL?

If source too random it's going to yield too many buckets.

Ingestion rates are fairly high but not insane.  About 4M inserts per
hour.. from 5-10GB…


On Sat, Jun 7, 2014 at 7:13 PM, Colin Clark co...@clark.ws wrote:

 Not if you add another column to the partition key; source for example.

 I would really try to stay away from the ordered partitioner if at all
 possible.

 What ingestion rates are you expecting, in size and speed.

 --
 Colin
 320-221-9531


 On Jun 7, 2014, at 9:05 PM, Kevin Burton bur...@spinn3r.com wrote:


 Thanks for the feedback on this btw.. .it's helpful.  My notes below.

 On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark co...@clark.ws wrote:

 No, you're not-the partition key will get distributed across the cluster
 if you're using random or murmur.


 Yes… I'm aware.  But in practice this is how it will work…

 If we create bucket b0, that will get hashed to h0…

 So say I have 50 machines performing writes, they are all on the same time
 thanks to ntpd, so they all compute b0 for the current bucket based on the
 time.

 That gets hashed to h0…

 If h0 is hosted on node0 … then all writes go to node zero for that 1
 second interval.

 So all my writes are bottlenecking on one node.  That node is *changing*
 over time… but they're not being dispatched in parallel over N nodes.  At
 most writes will only ever reach 1 node a time.



 You could also ensure that by adding another column, like source to
 ensure distribution. (Add the seconds to the partition key, not the
 clustering columns)

 I can almost guarantee that if you put too much thought into working
 against what Cassandra offers out of the box, that it will bite you later.


 Sure.. I'm trying to avoid the 'bite you later' issues. More so because
 I'm sure there are Cassandra gotchas to worry about.  Everything has them.
  Just trying to avoid the land mines :-P


 In fact, the use case that you're describing may best be served by a
 queuing mechanism, and using Cassandra only for the underlying store.


 Yes… that's what I'm doing.  We're using apollo to fan out the queue, but
 the writes go back into cassandra and needs to be read out sequentially.



 I used this exact same approach in a use case that involved writing over
 a million events/second to a cluster with no problems.  Initially, I
 thought ordered partitioner was the way to go too.  And I used separate
 processes to aggregate, conflate, and handle distribution to clients.



 Yes. I think using 100 buckets will work for now.  Plus I don't have to
 change the partitioner on our existing cluster and I'm lazy :)



 Just my two cents, but I also spend the majority of my days helping
 people utilize Cassandra correctly, and rescuing those that haven't.


 Definitely appreciate the feedback!  Thanks!

 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 Skype: *burtonator*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are
 people.




-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin
To have any redundancy in the system, start with at least 3 nodes and a 
replication factor of 3.

Try to have at least 8 cores, 32 gig ram, and separate disks for log and data.

Will you be replicating data across data centers?

--
Colin
320-221-9531


 On Jun 7, 2014, at 9:40 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 Oh.. To start with we're going to use from 2-10 nodes.. 
 
 I think we're going to take the original strategy and just to use 100 buckets 
 .. 0-99… then the timestamp under that..  I think it should be fine and won't 
 require an ordered partitioner. :)
 
 Thanks!
 
 
 On Sat, Jun 7, 2014 at 7:38 PM, Colin Clark co...@clark.ws wrote:
 With 100 nodes, that ingestion rate is actually quite low and I don't think 
 you'd need another column in the partition key.
 
 You seem to be set in your current direction.  Let us know how it works out.
 
 --
 Colin
 320-221-9531
 
 
 On Jun 7, 2014, at 9:18 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 What's 'source' ? You mean like the URL?
 
 If source too random it's going to yield too many buckets.  
 
 Ingestion rates are fairly high but not insane.  About 4M inserts per 
 hour.. from 5-10GB… 
 
 
 On Sat, Jun 7, 2014 at 7:13 PM, Colin Clark co...@clark.ws wrote:
 Not if you add another column to the partition key; source for example.  
 
 I would really try to stay away from the ordered partitioner if at all 
 possible.
 
 What ingestion rates are you expecting, in size and speed.
 
 --
 Colin
 320-221-9531
 
 
 On Jun 7, 2014, at 9:05 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 
 Thanks for the feedback on this btw.. .it's helpful.  My notes below.
 
 On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark co...@clark.ws wrote:
 No, you're not-the partition key will get distributed across the cluster 
 if you're using random or murmur.
 
 Yes… I'm aware.  But in practice this is how it will work…
 
 If we create bucket b0, that will get hashed to h0…
 
 So say I have 50 machines performing writes, they are all on the same 
 time thanks to ntpd, so they all compute b0 for the current bucket based 
 on the time.
 
 That gets hashed to h0…
 
 If h0 is hosted on node0 … then all writes go to node zero for that 1 
 second interval.
 
 So all my writes are bottlenecking on one node.  That node is *changing* 
 over time… but they're not being dispatched in parallel over N nodes.  At 
 most writes will only ever reach 1 node a time.
 
  
 You could also ensure that by adding another column, like source to 
 ensure distribution. (Add the seconds to the partition key, not the 
 clustering columns)
 
 I can almost guarantee that if you put too much thought into working 
 against what Cassandra offers out of the box, that it will bite you 
 later.
 
 Sure.. I'm trying to avoid the 'bite you later' issues. More so because 
 I'm sure there are Cassandra gotchas to worry about.  Everything has 
 them.  Just trying to avoid the land mines :-P
  
 In fact, the use case that you're describing may best be served by a 
 queuing mechanism, and using Cassandra only for the underlying store.
 
 Yes… that's what I'm doing.  We're using apollo to fan out the queue, but 
 the writes go back into cassandra and needs to be read out sequentially.
  
 
 I used this exact same approach in a use case that involved writing over 
 a million events/second to a cluster with no problems.  Initially, I 
 thought ordered partitioner was the way to go too.  And I used separate 
 processes to aggregate, conflate, and handle distribution to clients.
 
 
 Yes. I think using 100 buckets will work for now.  Plus I don't have to 
 change the partitioner on our existing cluster and I'm lazy :)
  
 
 Just my two cents, but I also spend the majority of my days helping 
 people utilize Cassandra correctly, and rescuing those that haven't.
 
 Definitely appreciate the feedback!  Thanks!
  
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin Clark
Write Consistency Level + Read Consistency Level > Replication Factor
ensures your reads will read consistently, and having 3 nodes lets you
achieve redundancy in the event of node failure.

So writing with CL of LOCAL_QUORUM and reading with CL of LOCAL_QUORUM
(2 + 2 > 3) with a replication factor of 3 ensures consistent reads and
protection against losing a node.

In the event of losing a node, you can downgrade the CL automatically and then
also accept a little eventual consistency.
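
A sketch of what that looks like from the client side with the DataStax Java
driver, reusing the hypothetical bucketed events table from the earlier sketch;
the downgrading retry policy is the "accept a little eventual consistency" part:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.policies.DowngradingConsistencyRetryPolicy;

public class QuorumReadWrite {
    public static void main(String[] args) {
        // The downgrading retry policy retries at a lower consistency level
        // when a quorum of replicas isn't available.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
                .build();
        Session session = cluster.connect("crawl");   // hypothetical keyspace

        // RF = 3: a LOCAL_QUORUM write (2 replicas) and a LOCAL_QUORUM read
        // (2 replicas) always overlap on at least one replica, since 2 + 2 > 3.
        Statement write = new SimpleStatement(
                "INSERT INTO events (bucket, ts, payload) VALUES (1, now(), 'x')")
                .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
        session.execute(write);

        Statement read = new SimpleStatement(
                "SELECT payload FROM events WHERE bucket = 1 LIMIT 10")
                .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
        session.execute(read);

        cluster.close();
    }
}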


--
Colin
320-221-9531


On Jun 7, 2014, at 10:03 PM, James Campbell ja...@breachintelligence.com
wrote:

 This is a basic question, but having heard that advice before, I'm curious
about why the minimum recommended replication factor is three? Certainly
additional redundancy, and, I believe, a minimum threshold for paxos. Are
there other reasons?
On Jun 7, 2014 10:52 PM, Colin colpcl...@gmail.com wrote:
 To have any redundancy in the system, start with at least 3 nodes and a
replication factor of 3.

 Try to have at least 8 cores, 32 gig ram, and separate disks for log and
data.

 Will you be replicating data across data centers?

-- 
Colin
320-221-9531


On Jun 7, 2014, at 9:40 PM, Kevin Burton bur...@spinn3r.com wrote:

  Oh.. To start with we're going to use from 2-10 nodes..

 I think we're going to take the original strategy and just to use 100
buckets .. 0-99… then the timestamp under that..  I think it should be fine
and won't require an ordered partitioner. :)

 Thanks!


On Sat, Jun 7, 2014 at 7:38 PM, Colin Clark co...@clark.ws wrote:

  With 100 nodes, that ingestion rate is actually quite low and I don't
 think you'd need another column in the partition key.

  You seem to be set in your current direction.  Let us know how it works
 out.

 --
 Colin
 320-221-9531


 On Jun 7, 2014, at 9:18 PM, Kevin Burton bur...@spinn3r.com wrote:

   What's 'source' ? You mean like the URL?

  If source too random it's going to yield too many buckets.

  Ingestion rates are fairly high but not insane.  About 4M inserts per
 hour.. from 5-10GB…


 On Sat, Jun 7, 2014 at 7:13 PM, Colin Clark co...@clark.ws wrote:

  Not if you add another column to the partition key; source for example.


  I would really try to stay away from the ordered partitioner if at all
 possible.

  What ingestion rates are you expecting, in size and speed.

 --
 Colin
 320-221-9531


 On Jun 7, 2014, at 9:05 PM, Kevin Burton bur...@spinn3r.com wrote:


  Thanks for the feedback on this btw.. .it's helpful.  My notes below.

 On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark co...@clark.ws wrote:

  No, you're not-the partition key will get distributed across the
 cluster if you're using random or murmur.


  Yes… I'm aware.  But in practice this is how it will work…

  If we create bucket b0, that will get hashed to h0…

  So say I have 50 machines performing writes, they are all on the same
 time thanks to ntpd, so they all compute b0 for the current bucket based on
 the time.

  That gets hashed to h0…

  If h0 is hosted on node0 … then all writes go to node zero for that 1
 second interval.

  So all my writes are bottlenecking on one node.  That node is
 *changing* over time… but they're not being dispatched in parallel over N
 nodes.  At most writes will only ever reach 1 node a time.



  You could also ensure that by adding another column, like source to
 ensure distribution. (Add the seconds to the partition key, not the
 clustering columns)

  I can almost guarantee that if you put too much thought into working
 against what Cassandra offers out of the box, that it will bite you later.


  Sure.. I'm trying to avoid the 'bite you later' issues. More so because
 I'm sure there are Cassandra gotchas to worry about.  Everything has them.
  Just trying to avoid the land mines :-P


  In fact, the use case that you're describing may best be served by a
 queuing mechanism, and using Cassandra only for the underlying store.


  Yes… that's what I'm doing.  We're using apollo to fan out the queue,
 but the writes go back into cassandra and needs to be read out sequentially.



  I used this exact same approach in a use case that involved writing
 over a million events/second to a cluster with no problems.  Initially, I
 thought ordered partitioner was the way to go too.  And I used separate
 processes to aggregate, conflate, and handle distribution to clients.



  Yes. I think using 100 buckets will work for now.  Plus I don't have to
 change the partitioner on our existing cluster and I'm lazy :)



  Just my two cents, but I also spend the majority of my days helping
 people utilize Cassandra correctly, and rescuing those that haven't.


  Definitely appreciate the feedback!  Thanks!

  --

  Founder/CEO Spinn3r.com
  Location: *San Francisco, CA*
 Skype: *burtonator*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
  http://spinn3r.com
 War is peace. Freedom is slavery. Ignorance is strength

Re: Data model for streaming a large table in real time.

2014-06-07 Thread Colin Clark
You won't need containers - running one instance of Cassandra in that
configuration will hum along quite nicely and will make use of the cores
and memory.

I'd forget the RAID anyway and just mount the disks separately (JBOD).

--
Colin
320-221-9531


On Jun 7, 2014, at 10:02 PM, Kevin Burton bur...@spinn3r.com wrote:

Right now I'm just putting everything together as a proof of concept… so
just two cheap replicas for now.  And it's at 1/1th of the load.

If we lose data it's ok :)

I think our config will be 2-3x 400GB SSDs in RAID0 , 3 replicas, 16 cores,
probably 48-64GB of RAM each box.

Just one datacenter for now…

We're probably going to be migrating to using linux containers at some
point.  This way we can have like 16GB , one 400GB SSD, 4 cores for each
image.  And we can ditch the RAID which is nice. :)


On Sat, Jun 7, 2014 at 7:51 PM, Colin colpcl...@gmail.com wrote:

 To have any redundancy in the system, start with at least 3 nodes and a
 replication factor of 3.

 Try to have at least 8 cores, 32 gig ram, and separate disks for log and
 data.

 Will you be replicating data across data centers?

 --
 Colin
 320-221-9531


 On Jun 7, 2014, at 9:40 PM, Kevin Burton bur...@spinn3r.com wrote:

 Oh.. To start with we're going to use from 2-10 nodes..

 I think we're going to take the original strategy and just to use 100
 buckets .. 0-99… then the timestamp under that..  I think it should be fine
 and won't require an ordered partitioner. :)

 Thanks!


 On Sat, Jun 7, 2014 at 7:38 PM, Colin Clark co...@clark.ws wrote:

 With 100 nodes, that ingestion rate is actually quite low and I don't
 think you'd need another column in the partition key.

 You seem to be set in your current direction.  Let us know how it works
 out.

 --
 Colin
 320-221-9531


 On Jun 7, 2014, at 9:18 PM, Kevin Burton bur...@spinn3r.com wrote:

 What's 'source' ? You mean like the URL?

 If source too random it's going to yield too many buckets.

 Ingestion rates are fairly high but not insane.  About 4M inserts per
 hour.. from 5-10GB…


 On Sat, Jun 7, 2014 at 7:13 PM, Colin Clark co...@clark.ws wrote:

 Not if you add another column to the partition key; source for example.

 I would really try to stay away from the ordered partitioner if at all
 possible.

 What ingestion rates are you expecting, in size and speed.

 --
 Colin
 320-221-9531


 On Jun 7, 2014, at 9:05 PM, Kevin Burton bur...@spinn3r.com wrote:


 Thanks for the feedback on this btw.. .it's helpful.  My notes below.

 On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark co...@clark.ws wrote:

 No, you're not-the partition key will get distributed across the
 cluster if you're using random or murmur.


 Yes… I'm aware.  But in practice this is how it will work…

 If we create bucket b0, that will get hashed to h0…

 So say I have 50 machines performing writes, they are all on the same
 time thanks to ntpd, so they all compute b0 for the current bucket based on
 the time.

 That gets hashed to h0…

 If h0 is hosted on node0 … then all writes go to node zero for that 1
 second interval.

 So all my writes are bottlenecking on one node.  That node is *changing*
 over time… but they're not being dispatched in parallel over N nodes.  At
 most writes will only ever reach 1 node a time.



 You could also ensure that by adding another column, like source to
 ensure distribution. (Add the seconds to the partition key, not the
 clustering columns)

 I can almost guarantee that if you put too much thought into working
 against what Cassandra offers out of the box, that it will bite you later.


 Sure.. I'm trying to avoid the 'bite you later' issues. More so because
 I'm sure there are Cassandra gotchas to worry about.  Everything has them.
  Just trying to avoid the land mines :-P


 In fact, the use case that you're describing may best be served by a
 queuing mechanism, and using Cassandra only for the underlying store.


 Yes… that's what I'm doing.  We're using apollo to fan out the queue,
 but the writes go back into cassandra and needs to be read out sequentially.



 I used this exact same approach in a use case that involved writing
 over a million events/second to a cluster with no problems.  Initially, I
 thought ordered partitioner was the way to go too.  And I used separate
 processes to aggregate, conflate, and handle distribution to clients.



 Yes. I think using 100 buckets will work for now.  Plus I don't have to
 change the partitioner on our existing cluster and I'm lazy :)



 Just my two cents, but I also spend the majority of my days helping
 people utilize Cassandra correctly, and rescuing those that haven't.


 Definitely appreciate the feedback!  Thanks!

 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 Skype: *burtonator*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com
 War is peace. Freedom is slavery. Ignorance is strength

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Colin Clark
Is your version of Hector using native protocol or thrift?

--
Colin
+1 320 221 9531



On Mon, Jun 2, 2014 at 6:41 AM, Peter Lin wool...@gmail.com wrote:


 I'm happy to announce Concord has decided to open source our port of
 Hector to .Net.

 The project is hosted on google code
 https://code.google.com/p/nectar-client/

 I'm still adding code documentation and wiki pages. It has been tested
 against 1.1.x, 2.0.x

 thanks

 peter



Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Colin Clark
Unless a cassandra driver is using the native protocol, it's going to have
a very short life going forward.

--
Colin
+1 320 221 9531



On Mon, Jun 2, 2014 at 7:10 AM, Peter Lin wool...@gmail.com wrote:


 it is using thrift. I've updated the project page to state that info.


 On Mon, Jun 2, 2014 at 8:08 AM, Colin Clark co...@clark.ws wrote:

 Is your version of Hector using native protocol or thrift?

  --
 Colin
 +1 320 221 9531



 On Mon, Jun 2, 2014 at 6:41 AM, Peter Lin wool...@gmail.com wrote:


 I'm happy to announce Concord has decided to open source our port of
 Hector to .Net.

 The project is hosted on google code
 https://code.google.com/p/nectar-client/

 I'm still adding code documentation and wiki pages. It has been tested
 against 1.1.x, 2.0.x

 thanks

 peter






Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Colin Clark
Peter,

There's very little reason today to write your own Cassandra driver for
.net, java, or python.  Those firms that do are now starting to wrap those
drivers with any specific functionality they might require, like Netflix,
for example.  Have you looked at DataStax's .NET driver?

--
Colin
+1 320 221 9531



On Mon, Jun 2, 2014 at 7:38 AM, Peter Lin wool...@gmail.com wrote:


 thanks for the correction. Maybe it's just me, but I wish the
 implementation were also in apache's repo. It's not a big thing, but having
 multiple github forks to keep track of is a bit annoying. I'd rather spend
 time coding instead of screwing with git on windows.


 On Mon, Jun 2, 2014 at 8:29 AM, Benedict Elliott Smith 
 belliottsm...@datastax.com wrote:

 The native protocol specification has always been in the Apache Cassandra
 repository. The implementations are not.


 On 2 June 2014 13:25, Peter Lin wool...@gmail.com wrote:


 There's nothing preventing support for native protocol going forward. It
 was easier to go with thrift and I happen to like thrift. Native protocol
 is still relatively new, so I'm taking a wait-and-see approach. Is the
 native protocol specification and drivers still in DataStax's git?

 If it's going to be the standard protocol, then it really should be in
 apache's repo. That's my biased opinion.




 On Mon, Jun 2, 2014 at 8:16 AM, Colin Clark co...@clark.ws wrote:

 Unless a cassandra driver is using the native protocol, it's going to
 have a very short life going forward.

 --
 Colin
 +1 320 221 9531



 On Mon, Jun 2, 2014 at 7:10 AM, Peter Lin wool...@gmail.com wrote:


 it is using thrift. I've updated the project page to state that info.


 On Mon, Jun 2, 2014 at 8:08 AM, Colin Clark co...@clark.ws wrote:

 Is your version of Hector using native protocol or thrift?

  --
 Colin
 +1 320 221 9531



 On Mon, Jun 2, 2014 at 6:41 AM, Peter Lin wool...@gmail.com wrote:


 I'm happy to announce Concord has decided to open source our port of
 Hector to .Net.

 The project is hosted on google code
 https://code.google.com/p/nectar-client/

 I'm still adding code documentation and wiki pages. It has been
 tested against 1.1.x, 2.0.x

 thanks

 peter










Re: Batched statements two different cassandra clusters.

2014-06-01 Thread Colin
Set it up as one cluster with multiple datacenters and configure replication 
accordingly.
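
A sketch of the keyspace side of that; the keyspace and data center names are
placeholders, and the DC names have to match what your snitch reports:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class MultiDcKeyspace {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        // One logical cluster spanning two data centers: NetworkTopologyStrategy
        // keeps 3 replicas in each DC, so a write accepted in either DC shows up
        // in the other without any application-level dual writes.
        session.execute("CREATE KEYSPACE IF NOT EXISTS content WITH replication = "
                + "{'class': 'NetworkTopologyStrategy', 'dc_primary': 3, 'dc_cache': 3}");
        cluster.close();
    }
}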

--
Colin Clark 
+1-320-221-9531
 

 On Jun 1, 2014, at 2:43 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 As far as I understand, this is impossible.
 
 There isn't a way to figure out which statement goes to which cluster.
 
 We are going to have two different clusters because the hardware config is 
 slightly different and so is our caching strategy.
 
 A plan B could be to write to both commits to something like JMS/Apollo and 
 then use transactional messages to verify that everything is written to both 
 places.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Tune cache MB settings per table.

2014-06-01 Thread Colin
The OS should handle this really well as long as you're on a v3 Linux kernel.

--
Colin Clark 
+1-320-221-9531
 

 On Jun 1, 2014, at 2:49 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 It's possible to set caching to:
 
 all, keys_only, rows_only, or none
 
 .. for a given table.
 
 But we have one table which is MASSIVE and we only need the most recent 4-8 
 hours in memory.  
 
 Anything older than that can go to disk as the queries there are very rare.
 
 … but I don't think cassandra can do this (which is a shame).
 
 Another option is to partition our tables per hour… then tell the older 
 tables to cache 'none'… 
 
 I hate this option though.  A smarter mechanism would be to have a compaction 
 strategy that created an SSTable for every hour and then had custom caching 
 settings for that table.
 
 The additional upside for this is that TTLs would just drop the older data in 
 the compactor.. 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Tune cache MB settings per table.

2014-06-01 Thread Colin
Have you been unable to achieve your SLA's using Cassandra out of the box so 
far?

Based upon my experience, trying to tune Cassandra before the app is done and 
without simulating real world load patterns, you might actually be doing 
yourself a disservice.

--
Colin
320-221-9531


 On Jun 1, 2014, at 6:08 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 Not in our experience… We've been using fadvise DONTNEED to purge pages
 that aren't necessary any longer.
 
 Of course YMMV based on your usage.  I tend to like to control everything 
 explicitly instead of having magic.
 
 That's worked out very well for us in the past so it would be nice to still 
 have this on cassandra.
 
 
 On Sun, Jun 1, 2014 at 12:53 PM, Colin co...@clark.ws wrote:
 The OS should handle this really well as long as your on v3 linux kernel 
  
 
 --
 Colin Clark 
 +1-320-221-9531
  
 
 On Jun 1, 2014, at 2:49 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 It's possible to set caching to:
 
 all, keys_only, rows_only, or none
 
 .. for a given table.
 
 But we have one table which is MASSIVE and we only need the most recent 4-8 
 hours in memory.  
 
 Anything older than that can go to disk as the queries there are very rare.
 
 … but I don't think cassandra can do this (which is a shame).
 
 Another option is to partition our tables per hour… then tell the older 
 tables to cache 'none'… 
 
 I hate this option though.  A smarter mechanism would be to have a 
 compaction strategy that created an SSTable for every hour and then had 
 custom caching settings for that table.
 
 The additional upside for this is that TTLs would just drop the older data 
 in the compactor.. 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Tune cache MB settings per table.

2014-06-01 Thread Colin
Your data model will most likely be the far most important component of your 
migration.  Get that right, and the rest is easy.

--
Colin Clark 
+1-320-221-9531
 

 On Jun 1, 2014, at 7:01 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 Good question. still migrating.. but we don't want to paint ourselves into a 
 corner.
 
 There's an interesting line between premature optimization and painting 
 yourself into a corner ;)
 
 Best to get it right in between both extremes.
 
 
 On Sun, Jun 1, 2014 at 4:30 PM, Colin colpcl...@gmail.com wrote:
 Have you been unable to achieve your SLA's using Cassandra out of the box so 
 far?
 
 Based upon my experience, trying to tune Cassandra before the app is done 
 and without simulating real world load patterns, you might actually be doing 
 yourself a disservice.
 
 --
 Colin
 320-221-9531
 
 
 On Jun 1, 2014, at 6:08 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 Not in our experience… We've been using fadvise don't need to purge pages 
 that aren't necessary any longer.
 
 Of course YMMV based on your usage.  I tend to like to control everything 
 explicitly instead of having magic.
 
 That's worked out very well for us in the past so it would be nice to still 
 have this on cassandra.
 
 
 On Sun, Jun 1, 2014 at 12:53 PM, Colin co...@clark.ws wrote:
 The OS should handle this really well as long as your on v3 linux 
 kernel  
 
 --
 Colin Clark 
 +1-320-221-9531
  
 
 On Jun 1, 2014, at 2:49 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 It's possible to set caching to:
 
 all, keys_only, rows_only, or none
 
 .. for a given table.
 
 But we have one table which is MASSIVE and we only need the most recent 
 4-8 hours in memory.  
 
 Anything older than that can go to disk as the queries there are very 
 rare.
 
 … but I don't think cassandra can do this (which is a shame).
 
 Another option is to partition our tables per hour… then tell the older 
 tables to cache 'none'… 
 
 I hate this option though.  A smarter mechanism would be to have a 
 compaction strategy that created an SSTable for every hour and then had 
 custom caching settings for that table.
 
 The additional upside for this is that TTLs would just drop the older 
 data in the compactor.. 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 
 
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Avoiding High Cell Tombstone Count

2014-05-27 Thread Colin
Charlie,

I would be willing to help you out with your issues tomorrow afternoon, feel 
free to give me a call after 4m ET.  There are lots of people who store *and* 
update data with cassandra (at scale).

--
Colin Clark   | Solutions Architect
DataStax  |  www.datastax.com 
m | +1-320-221-9531
e  | colin.cl...@datastax.com


We power the big data applications that transform business.

More than 400 customers, including startups and twenty-five percent of the 
Fortune 100 rely on DataStax's massively scalable, flexible, fast and 
continuously available big data platform built on Apache Cassandra™. DataStax 
integrates in one cluster (thus requiring no ETL)  enterprise-ready Cassandra, 
Apache Hadoop™ for analytics and Apache Solr™ for search, across multiple data 
centers and in the cloud all while providing advanced enterprise security 
features that keep data safe.
 

 On May 27, 2014, at 4:16 PM, Robert Coli rc...@eventbrite.com wrote:
 
 On Sun, May 25, 2014 at 12:01 PM, Charlie Mason charlie@gmail.com 
 wrote:
 I have a table which has one column per user. It receives a lot of updates
 to these columns throughout its lifetime. They are always updates on a few
 specific columns. Firstly, is Cassandra storing a Tombstone for each of these
 old column values?
 ...
 As you can see that's an awful lot of tombstoned cells. That's after a full
 compaction as well. Just so you are aware this table is updated using a 
 Paxos IF statement.
 
 If you do a lot of UPDATEs, perhaps a log structured database with immutable 
 datafiles from which row fragments are reconciled on read is not for you. 
 Especially if you have to use lightweight transactions to make your 
 application semantics work.
  
 Would I be better off adding a time-based key to the primary key, then doing a
 separate insert and then deleting the original? If I did the query with a
 limit of one it should always find the first rows before hitting a
 tombstone. Is that correct?
 
 I have no idea what you're asking regarding a LIMIT of 1... in general 
 anything that scans over multiple partitions is bad. I'm pretty sure you 
 almost always want to use a design which allows you to use FIRST instead of 
 LIMIT for this reason.
 
 The overall form of your questions suggests you might be better off using the 
 right tool for the job, which may not be Cassandra.
 
 =Rob


Re: Cassandra CSV JSON uploader

2014-05-27 Thread Colin
Why wouldn't you use the DataStax Enterprise in-memory option vs Oracle Coherence?


 

 On May 27, 2014, at 10:33 PM, Samir Faci sa...@esamir.com wrote:
 
 http://www.datastax.com/docs/1.0/references/sstable2json  might be what 
 you're looking for.  It's in the bin folder of your cassandra installation.
 
 Though I really doubt you'd want to just drop what is in Oracle into 
 cassandra.  SQL to NoSQL is rarely ever a 1 to 1 mapping.  
 
 
 On Tue, May 27, 2014 at 8:23 AM, Joyabrata Das 
 joy.luv.challen...@gmail.com wrote:
 Hi,
 
 Could someone please help on how to import data from Apache Cassandra to 
 Oracle Coherence.
 
 As per my understanding it could be done by sstable to JSON--JSON upload to 
 oracle coherence, however is there any code/tool/script to upload JSON to 
 Oracle Coherence.
 
 Thanks,
 Joy
 
 
 
 -- 
 Samir Faci
 *insert title*
 fortune | cowsay -f /usr/share/cows/tux.cow
 
 Sent from my non-iphone laptop.


Re: decommissioning a node

2014-05-25 Thread Colin Clark
Try this:

nodetool decommission host-id-of-node-to-decommission

UN means UP, NORMAL

--
Colin
+1 320 221 9531



On Sun, May 25, 2014 at 9:09 AM, Tim Dunphy bluethu...@gmail.com wrote:

 Also for information that may help diagnose this issue I am running
 cassandra 2.0.7

 I am also using these java options:

 [root@beta:/etc/alternatives/cassandrahome] #grep -i jvm_opts
 conf/cassandra-env.sh  | grep -v '#'
 JVM_OPTS=$JVM_OPTS -ea
 JVM_OPTS=$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar
 JVM_OPTS=$JVM_OPTS -XX:+CMSClassUnloadingEnabled
 JVM_OPTS=$JVM_OPTS -XX:+UseThreadPriorities
 JVM_OPTS=$JVM_OPTS -XX:ThreadPriorityPolicy=42
 JVM_OPTS=$JVM_OPTS -Xms${MAX_HEAP_SIZE}
 JVM_OPTS=$JVM_OPTS -Xmx${MAX_HEAP_SIZE}
 JVM_OPTS=$JVM_OPTS -Xmn${HEAP_NEWSIZE}
 JVM_OPTS=$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError
 JVM_OPTS=$JVM_OPTS
 -XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof
 JVM_OPTS=$JVM_OPTS -Xss256k
 JVM_OPTS=$JVM_OPTS -XX:StringTableSize=103
 JVM_OPTS=$JVM_OPTS -XX:+UseParNewGC
 JVM_OPTS=$JVM_OPTS -XX:+UseConcMarkSweepGC
 JVM_OPTS=$JVM_OPTS -XX:+CMSParallelRemarkEnabled
 JVM_OPTS=$JVM_OPTS -XX:SurvivorRatio=8
 JVM_OPTS=$JVM_OPTS -XX:MaxTenuringThreshold=1
 JVM_OPTS=$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75
 JVM_OPTS=$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly
 JVM_OPTS=$JVM_OPTS -XX:+UseTLAB
 JVM_OPTS=$JVM_OPTS -XX:+UseCondCardMark
 JVM_OPTS=$JVM_OPTS -Djava.net.preferIPv4Stack=true
 JVM_OPTS=$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT
 JVM_OPTS=$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false
 JVM_OPTS=$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false
 JVM_OPTS=$JVM_OPTS $JVM_EXTRA_OPTS


 Still need to figure out why the node I want to decommission isn't
 listening on port 7199 and how I can actually decommission it.

 Thanks
 Tim


 On Sun, May 25, 2014 at 9:20 AM, Tim Dunphy bluethu...@gmail.com wrote:


 Hey all,

 I'm attempting to decommission a node I want to remove.

 First I get a status of the ring

 [root@beta-new:~] #nodetool status

 Datacenter: datacenter1

 ===

 Status=Up/Down

 |/ State=Normal/Leaving/Joining/Moving

 --  Address Load   Tokens  Owns   Host ID
   Rack

 UN  10.10.1.94  197.37 KB  256 49.4%
 fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1

 UN  10.10.1.18216.95 KB  256 50.6%
 f2a48fc7-a362-43f5-9061-4bb3739fdeaf  rack


 I see that the node I want to remove is UP. Tho I believe UN means up I
 don't know what it stands for.


 [root@beta-new:~] #nodetool -host  10.10.1.18 decommission

 Failed to connect to ' 10.10.1.18 : Connection timed out

 The connection to the node I want to decommission times out. :(

 I’m running this node from the seed node, and while I do see port 7199
 active and listening there, I do NOT see this port active and listening on
 the node that I want to decommission.


 Seed node:

 [root@beta-new:~] #lsof -i :7199

 COMMAND   PID USER   FD   TYPEDEVICE SIZE/OFF NODE NAME

 java15331 root   51u  IPv4 566368606  0t0  TCP *:7199 (LISTEN)


 [root@beta:/etc/alternatives/cassandrahome] #lsof -i :7199

 [root@beta:/etc/alternatives/cassandrahome] #


 However cassandra does seem to be running on the node I want to
 decommission in addition to it being shown as UN by nodetool status:


 [root@beta:/etc/alternatives/cassandrahome] #netstat -tulpn | grep -i
 listen | grep java

 tcp0  0 0.0.0.0:46755   0.0.0.0:*
 LISTEN  23039/java

 tcp0  0 10.10.1.18:9160   0.0.0.0:*
   LISTEN  23039/java

 tcp0  0 0.0.0.0:42990   0.0.0.0:*
 LISTEN  23039/java

 tcp0  0 10.10.1.18:8081   0.0.0.0:*
   LISTEN  23039/java

 tcp0  0 10.10.1.18:9042   0.0.0.0:*
   LISTEN  23039/java

 tcp0  0 10.10.1.18:7000   0.0.0.0:*
   LISTEN  23039/java

 tcp0  0 0.0.0.0:71980.0.0.0:*
 LISTEN  23039/java


 So why do you think my seed is listening on port 7199 but the node I want
 to get rid of is not? And how can I accomplish my goal of deleting the
 unwanted node?


 Thanks

 Tim



 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B




 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B




Re: Possible to Add multiple columns in one query ?

2014-05-25 Thread Colin
Try async updates, collecting the futures every 1,000 statements, and play
around from there.

Also, in the real world, you'd want to use load balancing and token aware 
policies when connecting to the cluster.  This will actually bypass the 
coordinator and write directly to the correct nodes.

I will post a link to my github with an example when I get off the road
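
A minimal sketch of that pattern with the DataStax Java driver (not the GitHub
example mentioned above): token-aware routing plus async inserts, draining the
futures every 1,000 statements. The keyspace and the corrected tag schema from
the quoted message below are assumed, and the window size is just a starting
point to tune:

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class AsyncLoader {
    public static void main(String[] args) {
        // Token-aware + DC-aware routing sends each insert straight to a replica,
        // skipping the extra coordinator hop.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
                .build();
        Session session = cluster.connect("timeseries");   // hypothetical keyspace

        PreparedStatement insert = session.prepare(
                "INSERT INTO tag (tagid, idx, value) VALUES (?, ?, ?)");

        List<ResultSetFuture> inFlight = new ArrayList<ResultSetFuture>();
        long start = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            inFlight.add(session.executeAsync(
                    insert.bind(i % 50, new Date(start + i), (double) i)));
            if (inFlight.size() == 1000) {          // cap the number of in-flight writes
                for (ResultSetFuture f : inFlight) {
                    f.getUninterruptibly();         // blocks; surfaces any write error
                }
                inFlight.clear();
            }
        }
        for (ResultSetFuture f : inFlight) {        // drain the tail
            f.getUninterruptibly();
        }
        cluster.close();
    }
}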

--
Colin
320-221-9531


 On May 25, 2014, at 1:56 PM, Jack Krupansky j...@basetechnology.com wrote:
 
 Typo: I presume “channelid” should be “tagid” for the partition key for your 
 table.
  
 Yes, BATCH statements are the way to go, but be careful not to make your 
 batches too large, otherwise you could lose performance when Cassandra is 
 relatively idle while the batch is slowly streaming in to the coordinator 
 node over the network. Better to break up a large batch into multiple 
 moderate size batches (exact size and number will vary and need testing to 
 deduce) that will transmit quicker and can be executed in parallel.
  
 I’m not sure Cassandra on a laptop would be the best measure of performance 
 for a real cluster, especially compared to a server with more CPU cores than 
 your laptop.
  
 And for a real cluster, rows with different partition keys can be sent to a 
 coordinator node that owns that partition key, which could be multiple nodes 
 for RF > 1.
  
 -- Jack Krupansky
  
 From: Mark Farnan
 Sent: Sunday, May 25, 2014 9:36 AM
 To: user@cassandra.apache.org
 Subject: Possible to Add multiple columns in one query ?
  
 I’m sure this is a  CQL 101 question, but.  
  
 Is it possible to add MULTIPLE   Rows/Columns  to a single Partition in a 
 single CQL 3  Query / Call. 
  
 Need:
 I’m trying to find the most efficient way to add multiple time series events 
 to a table in a single call.
 Whilst most time series data comes in sequentially, we have a case where it 
 is often loaded in bulk,  say sent  100,000 points for 50  channels/tags  at 
 one go.  (sometimes more), and this needs to be loaded as quickly and 
 efficiently as possible.
  
 Fairly standard Time-Series schema (this is for testing purposes only at this 
 point, and doesn’t represent final schemas)
  
 CREATE TABLE tag (
   tagid int,
   idx timestamp,
   value double,
   PRIMARY KEY (channelid, idx)
 ) WITH CLUSTERING ORDER BY (idx DESC);
  
  
 Currently I’m using Batch statements, but even that is not fast enough.
  
 Note: At this point I’m testing on a single node cluster on laptop, to 
 compare different versions.
  
 We are using DataStax C# 2.0 (beta) client. And Cassandra 2.0.7
  
 Regards
 Mark.


Re: Possible to Add multiple columns in one query ?

2014-05-25 Thread Colin
Also, make sure you're using prepared statements.
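
A sketch combining that with the batching advice quoted below: one prepared
INSERT reused across a moderate UNLOGGED batch in which every row shares the
same partition key (keyspace, batch size and the corrected tag schema are
assumptions):

import java.util.Date;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class SinglePartitionBatch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("timeseries");   // hypothetical keyspace

        // Prepared once, bound many times; the server parses the statement only once.
        PreparedStatement insert = session.prepare(
                "INSERT INTO tag (tagid, idx, value) VALUES (?, ?, ?)");

        // Every row here shares tagid = 42, so the whole batch lands on one replica
        // set; UNLOGGED skips the batch log since cross-partition atomicity isn't needed.
        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 500; i++) {
            batch.add(insert.bind(42, new Date(start + i), 0.1 * i));
        }
        session.execute(batch);
        cluster.close();
    }
}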

--
Colin
320-221-9531


 On May 25, 2014, at 1:56 PM, Jack Krupansky j...@basetechnology.com wrote:
 
 Typo: I presume “channelid” should be “tagid” for the partition key for your 
 table.
  
 Yes, BATCH statements are the way to go, but be careful not to make your 
 batches too large, otherwise you could lose performance when Cassandra is 
 relatively idle while the batch is slowly streaming in to the coordinator 
 node over the network. Better to break up a large batch into multiple 
 moderate size batches (exact size and number will vary and need testing to 
 deduce) that will transmit quicker and can be executed in parallel.
  
 I’m not sure Cassandra on a laptop would be the best measure of performance 
 for a real cluster, especially compared to a server with more CPU cores than 
 your laptop.
  
 And for a real cluster, rows with different partition keys can be sent to a 
 coordinator node that owns that partition key, which could be multiple nodes 
 for RF > 1.
  
 -- Jack Krupansky
  
 From: Mark Farnan
 Sent: Sunday, May 25, 2014 9:36 AM
 To: user@cassandra.apache.org
 Subject: Possible to Add multiple columns in one query ?
  
 I’m sure this is a  CQL 101 question, but.  
  
 Is it possible to add MULTIPLE   Rows/Columns  to a single Partition in a 
 single CQL 3  Query / Call. 
  
 Need:
 I’m trying to find the most efficient way to add multiple time series events 
 to a table in a single call.
 Whilst most time series data comes in sequentially, we have a case where it 
 is often loaded in bulk,  say sent  100,000 points for 50  channels/tags  at 
 one go.  (sometimes more), and this needs to be loaded as quickly and 
 efficiently as possible.
  
 Fairly standard Time-Series schema (this is for testing purposes only at this 
 point, and doesn’t represent final schemas)
  
 CREATE TABLE tag (
   tagid int,
   idx timestamp,
   value double,
   PRIMARY KEY (channelid, idx)
 ) WITH CLUSTERING ORDER BY (idx DESC);
  
  
 Currently I’m using Batch statements, but even that is not fast enough.
  
 Note: At this point I’m testing on a single node cluster on laptop, to 
 compare different versions.
  
 We are using DataStax C# 2.0 (beta) client. And Cassandra 2.0.7
  
 Regards
 Mark.


Re: Ordering of schema updates and data modifications

2014-05-18 Thread Colin
Hi Jan,

Try waiting a period of time, say 60 seconds, after modifying the schema so the 
changes propagate throughout the cluster.

Also, you could add a step to your automation where you verify the schema 
change by attempting to insert/delete from the schema with a higher consistency 
level to make sure a good number of nodes are in agreement before proceeding.

Does this make sense?
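
One way to implement that verification step, as a sketch: poll schema_version
in system.local and system.peers until every node the coordinator knows about
reports the same version. The timeout, poll interval and the example ALTER
statement are arbitrary:

import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SchemaMigration {
    // Polls until the local node and all peers it knows about report one schema version.
    static boolean waitForSchemaAgreement(Session session, long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            Set<UUID> versions = new HashSet<UUID>();
            for (Row r : session.execute("SELECT schema_version FROM system.local")) {
                versions.add(r.getUUID("schema_version"));
            }
            for (Row r : session.execute("SELECT schema_version FROM system.peers")) {
                UUID v = r.getUUID("schema_version");
                if (v != null) {          // a down peer reports no version; ignore it here
                    versions.add(v);
                }
            }
            if (versions.size() == 1) {
                return true;
            }
            Thread.sleep(1000);
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        session.execute("ALTER TABLE myks.users ADD migrated_at timestamp");  // example schema change
        if (!waitForSchemaAgreement(session, 60000)) {
            throw new IllegalStateException("schema versions still disagree after 60s");
        }
        // Only now run the data modifications that depend on the new column.
        cluster.close();
    }
}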

--
Colin Clark 
+1-320-221-9531
 

 On May 18, 2014, at 3:30 AM, Jan Algermissen jan.algermis...@nordsc.com 
 wrote:
 
 Hi,
 
 in our project, we apparently have a problem or misunderstanding of the 
 relationship between schema changes and data updates.
 
 One team is doing automated tests during build and deployment that executes 
 data migration tests on a development cluster. In those migrations there will 
 be schema changes (adding rows) and subsequent data insertions involving 
 these rows.
 
 It seems, there are unpredictable times when the update reaches the cluster 
 *before* the schema change, causing the tests to fail.
 
 What can we do to enforce the schema update to have sufficiently happened 
 before the modification is hitting the database?
 
 Alternatively, what do others do to handle schema migrations during 
 continuous delivery processes.
 
 Jan


Re: initial token crashes cassandra

2014-05-17 Thread Colin
You may have used the old random partitioner token generator.  Use the murmur 
partitioner token generator instead.

--
Colin
320-221-9531


 On May 17, 2014, at 1:15 PM, Tim Dunphy bluethu...@gmail.com wrote:
 
 Hey all,
 
  I've set my initial_token in cassandra 2.0.7 using a python script I found 
 at the datastax wiki. 
 
 I've set the value like this:
 
 initial_token: 85070591730234615865843651857942052864
 
 And cassandra crashes when I try to start it:
 
 [root@beta:/etc/alternatives/cassandrahome] #./bin/cassandra -f
  INFO 18:14:38,511 Logging initialized
  INFO 18:14:38,560 Loading settings from 
 file:/usr/local/apache-cassandra-2.0.7/conf/cassandra.yaml
  INFO 18:14:39,151 Data files directories: [/var/lib/cassandra/data]
  INFO 18:14:39,152 Commit log directory: /var/lib/cassandra/commitlog
  INFO 18:14:39,153 DiskAccessMode 'auto' determined to be mmap, 
 indexAccessMode is mmap
  INFO 18:14:39,153 disk_failure_policy is stop
  INFO 18:14:39,153 commit_failure_policy is stop
  INFO 18:14:39,161 Global memtable threshold is enabled at 251MB
  INFO 18:14:39,362 Not using multi-threaded compaction
 ERROR 18:14:39,365 Fatal configuration error
 org.apache.cassandra.exceptions.ConfigurationException: For input string: 
 85070591730234615865843651857942052864
 at 
 org.apache.cassandra.dht.Murmur3Partitioner$1.validate(Murmur3Partitioner.java:178)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:440)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.clinit(DatabaseDescriptor.java:111)
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:153)
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:471)
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:560)
 For input string: 85070591730234615865843651857942052864
 Fatal configuration error; unable to start. See log for stacktrace.
 
 I really need to get replication going between 2 nodes. Can someone clue me 
 into why this may be crashing?
 
 Thanks!
 Tim
 
 -- 
 GPG me!!
 
 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
 


Re: initial token crashes cassandra

2014-05-17 Thread Colin Clark
You probably generated the wrong token type.  Look for a murmur token
generator on the Datastax site.

--
Colin
320-221-9531


On May 17, 2014, at 7:00 PM, Tim Dunphy bluethu...@gmail.com wrote:

Hi and thanks for your response.

The puzzling thing is that yes, I am using the murmur partitioner, yet I am
still getting the error I just told you guys about:

[root@beta:/etc/alternatives/cassandrahome] #grep -i partition
conf/cassandra.yaml | grep -v '#'
partitioner: org.apache.cassandra.dht.Murmur3Partitioner

Thanks
Tim


On Sat, May 17, 2014 at 3:23 PM, Colin colpcl...@gmail.com wrote:

 You may have used the old random partitioner token generator.  Use the
 murmur partitioner token generator instead.

 --
 Colin
 320-221-9531


 On May 17, 2014, at 1:15 PM, Tim Dunphy bluethu...@gmail.com wrote:

 Hey all,

  I've set my initial_token in cassandra 2.0.7 using a python script I
 found at the datastax wiki.

 I've set the value like this:

 initial_token: 85070591730234615865843651857942052864

 And cassandra crashes when I try to start it:

 [root@beta:/etc/alternatives/cassandrahome] #./bin/cassandra -f
  INFO 18:14:38,511 Logging initialized
  INFO 18:14:38,560 Loading settings from
 file:/usr/local/apache-cassandra-2.0.7/conf/cassandra.yaml
  INFO 18:14:39,151 Data files directories: [/var/lib/cassandra/data]
  INFO 18:14:39,152 Commit log directory: /var/lib/cassandra/commitlog
  INFO 18:14:39,153 DiskAccessMode 'auto' determined to be mmap,
 indexAccessMode is mmap
  INFO 18:14:39,153 disk_failure_policy is stop
  INFO 18:14:39,153 commit_failure_policy is stop
  INFO 18:14:39,161 Global memtable threshold is enabled at 251MB
  INFO 18:14:39,362 Not using multi-threaded compaction
 ERROR 18:14:39,365 Fatal configuration error
 org.apache.cassandra.exceptions.ConfigurationException: For input string:
 85070591730234615865843651857942052864
 at
 org.apache.cassandra.dht.Murmur3Partitioner$1.validate(Murmur3Partitioner.java:178)
 at
 org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:440)
 at
 org.apache.cassandra.config.DatabaseDescriptor.clinit(DatabaseDescriptor.java:111)
 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:153)
 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:471)
 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:560)
 For input string: 85070591730234615865843651857942052864
 Fatal configuration error; unable to start. See log for stacktrace.

 I really need to get replication going between 2 nodes. Can someone clue
 me into why this may be crashing?

 Thanks!
 Tim

 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B




-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Re: vcdiff/bmdiff , cassandra , and the ordered partitioner…

2014-05-17 Thread Colin
Cassandra offers compression out of the box.  Look into the options available 
upon table creation.
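
For example, a sketch of where those knobs live (table name and chunk size are
placeholders; LZ4 is already the default in 2.0, so this is mostly about tuning
chunk_length_kb for your read sizes):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class TableCompression {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        // Per-table, per-chunk compression; larger chunks compress better,
        // smaller chunks make point reads cheaper.
        session.execute("ALTER TABLE myks.pages WITH compression = "
                + "{'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': 64}");
        cluster.close();
    }
}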

The use of the ordered partitioner is an anti-pattern 999/1000 times.  It creates 
hot spots - the use of wide rows can often accomplish the same result through 
the use of clustering columns.

--
Colin
320-221-9531


 On May 17, 2014, at 10:15 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 So  I see that Cassandra doesn't support bmdiff/vcdiff.
 
 Is this primarily because most people aren't using the ordered partitioner?
 
 bmdiff gets good compression by storing similar content next to each page on 
 disk.  So lots of HTML content would compress well.  
 
 but if everything is being stored at random locations, you wouldn't get that 
 bump in storage / compression reduction.
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: initial token crashes cassandra

2014-05-17 Thread Colin Clark
Looks like you may have put the token next to num-tokens property in the
yaml file for one node.  I would double check the yaml's to make sure the
tokens are setup correctly and that the ip addresses are associated with
the right entries as well.

Compare them to a fresh download if possible to see what you've changed.

--
Colin
320-221-9531


On May 17, 2014, at 10:29 PM, Tim Dunphy bluethu...@gmail.com wrote:

You probably generated the wrong token type.  Look for a murmur token
 generator on the Datastax site.

What Colin is saying is that the tool you used to create the token, is not
 creating tokens usable for the Murmur3Partitioner. That tool is probably
 generating tokens for the (original) RandomPartitioner, which has a
 different range.


Thanks guys for your input. And I apologize for reading  Colin's initial
response too quickly which lets me know that I was probably using the wrong
token generator for the wrong partition type. That of course was the case.
So what I've done is use this token generator form the datastax website:

python -c 'print [str(((2**64 / number_of_tokens) * i) - 2**63) for i in range(number_of_tokens)]'


That algorithm generated a token I could use to start Cassandra on my
second node.


However at this stage I have both nodes running and I believe their
gossiping if I understand what I see here correctly:


 INFO 02:44:13,823 No gossip backlog; proceeding


However I've setup web pages for each of the two web servers that are
running Cassandra. And it looks like the seed node with all the data
is rendering correctly. But the node that's downstream from the seed
node is not receiving any of its data despite the message that I've
just shown you.


And if I go to the seed node and do a describe keyspaces I see the
keyspace that drives the website listed. It's called 'joke_fire1'


cqlsh describe keyspaces;

system  joke_fire1  system_traces

And if I go to the node that's downstream from the seed node and run
the same command:


cqlsh describe keyspaces;

system  system_traces


I don't see the important keyspace that runs the site.


I have the seed node's IP listed in 'seeds' in the cassandra.yaml on
the downstream node. So I'm not really sure why its' not receiving the
seed's data. If there's some command I need to run to flush the system
or something like that.


And if I do a nodetool ring command on the first (seed) host I don't
see the IP of the downstream node listed:







[root@beta-new:~] #nodetool ring | head -10

Note: Ownership information does not include topology; for complete
information, specify a keyspace


Datacenter: datacenter1

==

Address RackStatus State   LoadOwns
Token


10.10.1.94  rack1   Up Normal  150.64 KB   100.00%
-9173731940639284976

10.10.1.94  rack1   Up Normal  150.64 KB   100.00%
-9070607847117718988

 10.10.1.94  rack1   Up Normal  150.64 KB   100.00%
 -9060190512633067546

10.10.1.94  rack1   Up Normal  150.64 KB   100.00%
-8935690644016753923


And if I look on the downstream node and run nodetool ring I see only
the IP of the downstream node and not the seed listed:









[root@beta:/var/lib/cassandra] #nodetool ring | head -15


Datacenter: datacenter1

==

Address  RackStatus State   LoadOwns
 Token


10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-9223372036854775808

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-9151314442816847873

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-9079256848778919937

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-9007199254740992001

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-8935141660703064065

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-886308405136129

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-8791026472627208193

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-8718968878589280257

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-8646911284551352321

10.10.1.98  rack1   Up Normal  91.06 KB99.99%
-8574853690513424385


Yet in my seeds entry in cassandra.yaml I have the correct IP of my
seed node listed:


seed_provider:

- class_name: org.apache.cassandra.locator.SimpleSeedProvider

  # seeds is actually a comma-delimited list of addresses.

  - seeds: 10.10.1.94


So I'm just wondering what I'm missing in trying to get these two
nodes to communicate via gossip at this point.


Thanks!

Tim








On Sat, May 17, 2014 at 8:54 PM, Dave Brosius dbros...@mebigfatguy.comwrote:

  What Colin is saying is that the tool you used to create the token, is
 not creating tokens usable for the Murmur3Partitioner. That tool is
 probably generating tokens for the (original) RandomPartitioner, which has
 a different range.



 On 05

Re: What % of cassandra developers are employed by Datastax?

2014-05-16 Thread Colin
I used Cassandra for years at NYSE, and we were able to do what we wanted with
it by leveraging open source and internal development, knowing that Cassandra
did what we needed and that no one could ever take the code away from us in a
worst-case scenario.

Compare and contrast that with the pure proprietary model, and I'm sure it will 
help you sleep easier.

--
Colin Clark 
+1-320-221-9531
 

 On May 15, 2014, at 10:52 AM, Jack Krupansky j...@basetechnology.com 
 wrote:
 
 You can always check the project committer wiki:
 http://wiki.apache.org/cassandra/Committers
  
 -- Jack Krupansky
  
 From: Kevin Burton
 Sent: Wednesday, May 14, 2014 4:39 PM
 To: user@cassandra.apache.org
 Subject: What % of cassandra developers are employed by Datastax?
  
 I'm curious what % of cassandra developers are employed by Datastax?
  
 … vs other companies.
  
 When MySQL was acquired by Oracle this became a big issue because even though 
 you can't really buy an Open Source project, you can acquire all the 
 developers and essentially do the same thing.
  
 It would be sad if all of Cassandra's 'eggs' were in one basket and a similar 
 situation happens with Datastax.
  
 Seems like they're doing an awesome job to be sure but I guess it worries me 
 in the back of my mind.
  
  
  
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: How safe is nodetool move in 1.2 ?

2014-04-16 Thread Colin
I have recently tested this scenario under a couple versions of Cassandra and 
have been able to write and read to/from the cluster while performing a move.

I performed these tests utilizing an RF=2 on a three node cluster while 
performing quorum reads and received no errors due to unavailable replicas.

I will be doing some more testing on this under different scenarios, but so far 
so good.

However, I would strongly recommend an RF of at least 3 when performing quorum 
based reads, because otherwise you're subject to failed reads in the event of 
losing one node.

--
Colin
320-221-9531


 On Apr 16, 2014, at 6:28 PM, Richard Low rich...@wentnet.com wrote:
 
 On 16 April 2014 05:08, Jonathan Lacefield jlacefi...@datastax.com wrote:
 Assuming you have enough nodes not undergoing move to meet your CL 
 requirements, then yes, your cluster will still accept reads and writes.   
 However, it's always good to test this before doing it in production to 
 ensure your cluster and app will function as designed.
 
 This is not a correctness requirement: writes go to the move source and 
 destination during the move and reads come from the source. Otherwise you 
 could lose data during move (and certainly would lose data if replication 
 factor was one). However, nodes that are involved in the move will be slower 
 so it will be better for performance to not move nodes that share replicas 
 simultaneously.
 
 Richard.


[ANN] ccm-clj - test Cassandra clusters via Clojure

2014-04-08 Thread Colin Taylor
Hi, we have released ccm-clj (https://github.com/SMX-LTD/ccm-clj) a Clojure
interface to ccm (https://github.com/pcmanus/ccm) designed specifically for
the creation of arbitrary clusters in Clojure integration tests.

(if (not (ccm/cluster? "testcluster"))
  (do
    (ccm/new! "testcluster" cass-version num-nodes cql-port)
    (ccm/cql! (io/file "./test/resources/test-keyspace.cql"))
    (ccm/cql! (io/resource "schema/test-schema.cql") "testkeyspace")
    (ccm/cql! (io/file "./test/resources/test-data.cql") "testkeyspace"))
  (do
    (ccm/switch! "testcluster")
    (ccm/start! "testcluster")))

(ccm/remove! "testcluster")

cheers
Colin Taylor


Re: Row_key from sstable2json to actual value of the key

2014-04-03 Thread Colin Blower
Hey ng,

You can use CQL and have Cassandra do the conversion if you would like. If
your table uses int type keys:
 select * from tomcat.t5 where c1 = blobAsInt(0x0021);

The relevant section of the CQL3 docs are here:
http://cassandra.apache.org/doc/cql3/CQL.html#blobFun

You can use blobAs... for any type. I hope this helps.
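
The blob functions also go the other way, so you can render existing keys as
hex to match the sstable2json output. A quick sketch along the same lines,
assuming c1 is an int column in the t5 table from the question:

  SELECT c1, intAsBlob(c1) FROM tomcat.t5 LIMIT 3;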


On 04/03/2014 08:50 AM, ng wrote:
 sstable2json tomcat-t5-ic-1-Data.db -e
 gives me
  
 0021
 001f
 0020
  
  
 How do I convert this (hex) to actual value of column so I can do below
  
 select * from tomcat.t5 where c1='converted value';
  
 Thanks in advance for the help.



-- 
*Colin Blower*
/Software Engineer/
Barracuda Networks Inc.
+1 408-342-5576 (o)



Re: (SOLVED) Installing Datastax Cassandra 1.2.15 Using Yum (Java Issue)

2014-03-28 Thread Colin
OpenJDK will crash under load whilst running Cassandra.  

--
Colin 
+1 320 221 9531

 

 On Mar 28, 2014, at 4:11 PM, Jon Forrest jon.forr...@xoom.com wrote:
 
 In a previous message I described my guess at
 what was causing the Datastax Cassandra installation
 to require OpenJDK. Using the method I describe below,
 I'm now able to install the Datastax Cassandra rpm.
 Note that I have no idea (yet) whether Cassandra actually
 runs, but at least it installs.
 
 There's a wonderful opensource program out there called
 rpmrebuild. It lets you examine and modify the metadata
 in an rpm, including the dependencies. So, I ran
 
 rpmrebuild -e -p cassandra12-1.2.15-1.noarch.rpm
 
 This puts me in an editor with the spec file loaded.
 I searched for 'java' and found the line
 
 Requires:  java = 1.6.0
 
 I changed this line to
 
 Requires:  jdk = 1.6.0
 
 I wrote out the file and exited the editor. This created
 /root/rpmbuild/RPMS/noarch/cassandra12-1.2.15-1.noarch.rpm
 which I put in a local yum repo. I was then able to install
 this using yum and I was able to start Cassandra. Problem solved!
 
 Now I'm on to test whether this installation really works.
 
 Jon Forrest
 
 
 


Re: Which hector version is suitable for cassandra 2.0.6 ?

2014-03-27 Thread Colin
Have you tried the Datastax java driver?

--
Colin
320-221-9531


 On Mar 27, 2014, at 8:17 AM, user 01 user...@gmail.com wrote:
 
 Which hector version is suitable for cassandra 2.0.6 ? 
 
 I am seeing that version 1.1-4(which I believe is latest release?) has been 
 there around since very long time even before C* 2.x.x series wasn't out.  So 
 is it the latest release suitable for 2.0 series? 
 
 Is Hector under active development nowadays ?


Re: CQL decimal encoding

2014-02-25 Thread Colin Blower
Hey Ben,

It looks like you are trying to implement the Decimal type. You might
want to start with implementing the Integer type. The Decimal type
follows pretty easily from the Integer type.

For example:
// scale = first 4 bytes (big-endian int32); unscaled value = the remaining bytes
i = unmarshalInteger(data[4:])
s = decInt(data[0:4])
out = inf.NewDec(i, s)  // or inf.NewDecBig if the unscaled value is a *big.Int

On 02/24/2014 09:51 AM, Ben Hood wrote:
 Hey Peter,

 On Mon, Feb 24, 2014 at 5:25 PM, Peter Lin wool...@gmail.com wrote:
 Not sure what you mean by the question.

 Are you talking about the structure of BigDecimal in java? If that is your
 question, the java's BigDecimal uses the first 4 bytes for scale and
 remaining bytes for BigInteger
 I'm talking about the encoding of an arbitrary precision value in a
 platform neutral fashion such that interoperability between different
 language bindings is assured.

 Say you have an Java app writing to Cassandra and a Python app reading
 this data back out - ideally all language bindings would pack and
 unpack the data in an interoperable fashion. Also, I'm not sure what
 restrictions the server imposes on the encoding of the decimal type -
 can you for example just write any old (unchecked) bogus data into a
 decimal column via CQL?

 My situation is that I'm implementing the marshalling for the gocql
 driver, which is a generic CQL driver for Go. So ideally I'd like to
 provide an implementation that is generic across all applications.

 I'm using the class big.Rat from the Go standard library, which
 provides a similar interface to BigDecimal in Java and decimal.Decimal
 in Python. It has it's own encoding/decoding functions, but this
 format is specific to Go binary encoding and hence is not portable.

 So I have taken cue from 4 byte scale/variable length numerator
 strategy used by the Java BigDecimal and I've got something going
 using that: https://github.com/gocql/gocql/pull/120

 I guess I was looking for some kind of spec for the on-wire format of
 the decimal type.

 Or in the absence of a spec, just a heads up from other language
 driver implementors as to what approach they've taken.

 Does this make sense?

 Cheers,

 Ben


-- 
*Colin Blower*
/Software Engineer/
Barracuda Networks Inc.
+1 408-342-5576 (o)




Re: How do you remote backup your cassandra nodes ?

2014-02-21 Thread Colin Blower
You might want to use the Priam tool for backups.
https://github.com/Netflix/Priam

If you don't want to use Priam, you should read this Datastax entry on
backup and restore.
http://www.datastax.com/docs/1.0/operations/backup_restore

On 02/21/2014 11:19 AM, user 01 wrote:
 I'm wanting to back up my data to amazon S3. Can anyone please tell
 about which directories should I copy to the remote location for
 backup so as to restore the entire Cassandra data in the event of any
 failures?


 On Fri, Feb 21, 2014 at 1:43 AM, user 01 user...@gmail.com
 mailto:user...@gmail.com wrote:

 What is your strategy/tools set to backup your Cassandra nodes,
 apart from from cluster replication/ snapshots within cluster?



-- 
*Colin Blower*
/Software Engineer/
Barracuda Networks Inc.




Re: OpenJDK is not recommended? Why

2014-01-28 Thread Colin
OpenJDK has known issues, and they will raise their ugly little heads from time 
to time - I have experienced them myself.

To be safe, I would use the latest Oracle Java 7 release.

You may also be experiencing a configuration issue: make sure one node is 
specified as the seed node and that the other node knows that address as well.  
There's a good guide to configuration on the DataStax website.

--
Colin 
+1 320 221 9531

 

 On Jan 28, 2014, at 9:55 PM, Kumar Ranjan winnerd...@gmail.com wrote:
 
 I am in the process of setting up a 2 node cluster with C* version 2.0.4. When I 
 started each node, they failed to communicate; thus, each is running separately 
 and not in the same ring. So I started looking at the log files and saw the message 
 below:
 
 WARN [main] 2014-01-28 06:02:17,861 CassandraDaemon.java (line 155) OpenJDK 
 is not recommended. Please upgrade to the newest Oracle Java release
 
 Is this message informational only, or can it be a real issue? Is this why the two 
 nodes are not in a ring?
 
 -- Kumar 


Re: Datamodel for a highscore list

2014-01-23 Thread Colin Clark
Most of the work I've done like this has used sparse table definitions and
the empty column trick.  I didn't explain that very well in my last
response.

I think by using the userid as the rowid, and using the friend id as the
column name with the score, that I would put an entire user's friend list
on one row.  The row would look like this:

ROWID
USERID

Colin
+1 320 221 9531



On Thu, Jan 23, 2014 at 2:34 AM, Kasper Middelboe Petersen 
kas...@sybogames.com wrote:

 What would the consequence be of having this updated highscore table
 (using friendId as part of the clustering index to avoid name collisions):

 CREATE TABLE highscore (
   userId uuid,
   score int,
   friendId uuid,
   name varchar,
   PRIMARY KEY(userId, score, friendId)
 ) WITH CLUSTERING ORDER BY (score DESC);

 And then create an index:

 CREATE INDEX friendId_idx ON highscore ( friendId );

 The table will have many million (I should expect 100+ million) entries.
 Each friendId would appear as many times as the user has friends. It sounds
 like a scenario where I should take care of using a custom index.

 I haven't worked with custom indexes in Cassandra before, but I assume
 this would allow me to query the table based on (userId, friendId) for
 updating highscores.

 But what would happen in this case? What queries would be affected and
 roughly to what degree?

 Would this be a viable option?



 On Wed, Jan 22, 2014 at 6:44 PM, Kasper Middelboe Petersen 
 kas...@sybogames.com wrote:

 Hi!

 I'm a little worried about the data model I have come up with for
 handling highscores.

 I have a lot of users. Each user has a number of friends. I need a
 highscore list pr friend list.

 I would like to have it optimized for reading the highscores as opposed
 to setting a new highscore as the use case would suggest I would need to
 read the list a lot more than I would need write new highscores.

 Currently I have the following tables:
 CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
 int, PRIMARY KEY(userId))
 CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
 KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
 ... and a tables for friends - for the purpose of this mail assume
 everyone is friends with everyone else

 Reading the highscore list for a given user is easy. SELECT * FROM
 highscores WHERE userId = id.

 Problem is setting a new highscore.
 1. I need to read-before-write to get the old score
 2. I'm screwed if something goes wrong and the old score gets overwritten
 before all the friends highscore lists gets updated - and it is an highly
 visible error due to the same user is on the highscore multiple times.

 I would very much appreciate some feedback and/or alternatives to how to
 solve this with Cassandra.


 Thanks,
 Kasper





Re: Datamodel for a highscore list

2014-01-23 Thread Colin Clark
One of the tricks I've used a lot with Cassandra is a sparse CF definition,
inserting columns programmatically that weren't in the definition.

I'd be tempted to look at putting a users friend list on one row, the row
would look like this:

ROWID     COLUMNS

UserID    UserID:UserScore   FriendID:score   FriendID:score   ...

The UserID and UserScore columns are literal, and the FriendIDs are either
literal or keys into the user CF.

When a user gets a new score, you update that user's row and then issue a
general update updating all rows containing that userid with the new score.

That way, all friends are on the same row, which makes the query easy.  And you
can still issue a query to find the top score across the entire userbase by
querying userid and userscore.

Is this a better explanation than my previous, lame one?

Colin
+1 320 221 9531
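
A rough CQL3 sketch of that fan-out write, against the highscore table Kasper
proposes below (the uuids and scores here are illustrative). A logged batch
keeps the per-friend rows from being applied only partially, although it is not
isolated:

  BEGIN BATCH
    -- repeat this INSERT/DELETE pair once per friend of the scorer
    INSERT INTO highscore (userId, score, friendId, name)
      VALUES (0d2b2319-9c0f-4f1a-b6b4-3a2a1e5e0a01, 1200, 11b1e59c-ddfa-11e2-a28f-0800200c9a66, 'kasper');
    DELETE FROM highscore
      WHERE userId = 0d2b2319-9c0f-4f1a-b6b4-3a2a1e5e0a01
        AND score = 1100
        AND friendId = 11b1e59c-ddfa-11e2-a28f-0800200c9a66;
  APPLY BATCH;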



On Thu, Jan 23, 2014 at 2:34 AM, Kasper Middelboe Petersen 
kas...@sybogames.com wrote:

 What would the consequence be of having this updated highscore table
 (using friendId as part of the clustering index to avoid name collisions):

 CREATE TABLE highscore (
   userId uuid,
   score int,
   friendId uuid,
   name varchar,
   PRIMARY KEY(userId, score, friendId)
 ) WITH CLUSTERING ORDER BY (score DESC);

 And then create an index:

 CREATE INDEX friendId_idx ON highscore ( friendId );

 The table will have many million (I should expect 100+ million) entries.
 Each friendId would appear as many times as the user has friends. It sounds
 like a scenario where I should take care of using a custom index.

 I haven't worked with custom indexes in Cassandra before, but I assume
 this would allow me to query the table based on (userId, friendId) for
 updating highscores.

 But what would happen in this case? What queries would be affected and
 roughly to what degree?

 Would this be a viable option?



 On Wed, Jan 22, 2014 at 6:44 PM, Kasper Middelboe Petersen 
 kas...@sybogames.com wrote:

 Hi!

 I'm a little worried about the data model I have come up with for
 handling highscores.

 I have a lot of users. Each user has a number of friends. I need a
 highscore list pr friend list.

 I would like to have it optimized for reading the highscores as opposed
 to setting a new highscore as the use case would suggest I would need to
 read the list a lot more than I would need write new highscores.

 Currently I have the following tables:
 CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
 int, PRIMARY KEY(userId))
 CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
 KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
 ... and a tables for friends - for the purpose of this mail assume
 everyone is friends with everyone else

 Reading the highscore list for a given user is easy. SELECT * FROM
 highscores WHERE userId = id.

 Problem is setting a new highscore.
 1. I need to read-before-write to get the old score
 2. I'm screwed if something goes wrong and the old score gets overwritten
 before all the friends highscore lists gets updated - and it is an highly
 visible error due to the same user is on the highscore multiple times.

 I would very much appreciate some feedback and/or alternatives to how to
 solve this with Cassandra.


 Thanks,
 Kasper





Re: Datamodel for a highscore list

2014-01-22 Thread Colin
Read users score, increment, update friends list, update user with new high 
score

Would that work?

--
Colin 
+1 320 221 9531

 

 On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen 
 kas...@sybogames.com wrote:
 
 Hi!
 
 I'm a little worried about the data model I have come up with for handling 
 highscores.
 
 I have a lot of users. Each user has a number of friends. I need a highscore 
 list pr friend list.
 
 I would like to have it optimized for reading the highscores as opposed to 
 setting a new highscore as the use case would suggest I would need to read 
 the list a lot more than I would need write new highscores.
 
 Currently I have the following tables:
 CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo int, 
 PRIMARY KEY(userId))
 CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY 
 KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
 ... and a tables for friends - for the purpose of this mail assume everyone 
 is friends with everyone else
 
 Reading the highscore list for a given user is easy. SELECT * FROM highscores 
 WHERE userId = id.
 
 Problem is setting a new highscore.
 1. I need to read-before-write to get the old score
 2. I'm screwed if something goes wrong and the old score gets overwritten 
 before all the friends highscore lists gets updated - and it is an highly 
 visible error due to the same user is on the highscore multiple times.
 
 I would very much appreciate some feedback and/or alternatives to how to 
 solve this with Cassandra.
 
 
 Thanks,
 Kasper


Re: Datamodel for a highscore list

2014-01-22 Thread Colin Clark
How many users and how many games?

--
Colin
+1 320 221 9531



On Jan 22, 2014, at 10:59 AM, Kasper Middelboe Petersen 
kas...@sybogames.com wrote:

I can think of two cases where something bad would happen in this case:
1. Something bad happens after the increment but before some or all of the
update friend list is finished
2. Someone spams two scores at the same time creating a race condition
where one of them could have a score that is not yet updated (or the old
score, depending on if the increment of the highscore is done before or
after the friend updates)

Both are unlikely things to have happen often, but I'm going to have quite
a few users using the system and it would be bound to happen and I would
really like to avoid having data corruption (especially of the kind that is
also obvious to the users) if it can at all be avoided.

Also, should it happen, there is no way to either detect it or clean it up.


On Wed, Jan 22, 2014 at 6:48 PM, Colin colpcl...@gmail.com wrote:

 Read users score, increment, update friends list, update user with new
 high score

 Would that work?

 --
 Colin
 +1 320 221 9531



  On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen 
 kas...@sybogames.com wrote:
 
  Hi!
 
  I'm a little worried about the data model I have come up with for
 handling highscores.
 
  I have a lot of users. Each user has a number of friends. I need a
 highscore list pr friend list.
 
  I would like to have it optimized for reading the highscores as opposed
 to setting a new highscore as the use case would suggest I would need to
 read the list a lot more than I would need write new highscores.
 
  Currently I have the following tables:
  CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
 int, PRIMARY KEY(userId))
  CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
 KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
  ... and a tables for friends - for the purpose of this mail assume
 everyone is friends with everyone else
 
  Reading the highscore list for a given user is easy. SELECT * FROM
 highscores WHERE userId = id.
 
  Problem is setting a new highscore.
  1. I need to read-before-write to get the old score
  2. I'm screwed if something goes wrong and the old score gets
 overwritten before all the friends highscore lists gets updated - and it is
 an highly visible error due to the same user is on the highscore multiple
 times.
 
  I would very much appreciate some feedback and/or alternatives to how to
 solve this with Cassandra.
 
 
  Thanks,
  Kasper



Re: Datamodel for a highscore list

2014-01-22 Thread Colin
One way might be to use userid as a rowid, and then put all of the friends with 
their scores on the same row.  You could even update the column entry like this

Score:username or Id

This way the columns would come back sorted when reading the high scores for 
the group.

To update, set that user's score in that user's row after reading it for update.

So each row would look like this

Rowkey - userid
Columns would be userid:score followed by friendid:score

This way, you could also get global high score list

Each user would have their own row

If multiple games, create userid+gameid as rowkey

Might this work?


--
Colin 
+1 320 221 9531
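
In CQL3 terms, the userid+gameid rowkey above is just a composite partition
key. A minimal sketch with illustrative names, mirroring the clustering order
used elsewhere in the thread:

  CREATE TABLE friend_highscores (
    userId uuid,
    gameId uuid,
    score int,
    friendId uuid,
    PRIMARY KEY ((userId, gameId), score, friendId)
  ) WITH CLUSTERING ORDER BY (score DESC);

  -- one partition read returns a user's per-game leaderboard, highest score first:
  SELECT friendId, score FROM friend_highscores
   WHERE userId = 11b1e59c-ddfa-11e2-a28f-0800200c9a66
     AND gameId = 2d6cf2d4-3f1a-4a3b-9d6e-0b1c2d3e4f50
   LIMIT 50;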

 

 On Jan 22, 2014, at 11:13 AM, Kasper Middelboe Petersen 
 kas...@sybogames.com wrote:
 
 Many million users. Just the one game- I might have some different scores I 
 need to keep track of, but I very much hope to be able to use the same 
 approach for those as for the high score mentioned here.
 
 
 On Wed, Jan 22, 2014 at 7:08 PM, Colin Clark co...@clark.ws wrote:
 How many users and how many games?
 
 
 --
 Colin 
 +1 320 221 9531
 
  
 
 On Jan 22, 2014, at 10:59 AM, Kasper Middelboe Petersen 
 kas...@sybogames.com wrote:
 
 I can think of two cases where something bad would happen in this case:
 1. Something bad happens after the increment but before some or all of the 
 update friend list is finished
 2. Someone spams two scores at the same time creating a race condition 
 where one of them could have a score that is not yet updated (or the old 
 score, depending on if the increment of the highscore is done before or 
 after the friend updates)
 
 Both are unlikely things to have happen often, but I'm going to have quite 
 a few users using the system and it would be bound to happen and I would 
 really like to avoid having data corruption (especially of the kind that is 
 also obvious to the users) if it can at all be avoided.
 
 Also, should it happen, there is no way to either detect it or clean it up.
 
 
 On Wed, Jan 22, 2014 at 6:48 PM, Colin colpcl...@gmail.com wrote:
 Read users score, increment, update friends list, update user with new 
 high score
 
 Would that work?
 
 --
 Colin
 +1 320 221 9531
 
 
 
  On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen 
  kas...@sybogames.com wrote:
 
  Hi!
 
  I'm a little worried about the data model I have come up with for 
  handling highscores.
 
  I have a lot of users. Each user has a number of friends. I need a 
  highscore list pr friend list.
 
  I would like to have it optimized for reading the highscores as opposed 
  to setting a new highscore as the use case would suggest I would need to 
  read the list a lot more than I would need write new highscores.
 
  Currently I have the following tables:
  CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo 
  int, PRIMARY KEY(userId))
  CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY 
  KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
  ... and a tables for friends - for the purpose of this mail assume 
  everyone is friends with everyone else
 
  Reading the highscore list for a given user is easy. SELECT * FROM 
  highscores WHERE userId = id.
 
  Problem is setting a new highscore.
  1. I need to read-before-write to get the old score
  2. I'm screwed if something goes wrong and the old score gets 
  overwritten before all the friends highscore lists gets updated - and it 
  is an highly visible error due to the same user is on the highscore 
  multiple times.
 
  I would very much appreciate some feedback and/or alternatives to how to 
  solve this with Cassandra.
 
 
  Thanks,
  Kasper
 
 


Re: Tracking word frequencies

2014-01-20 Thread Colin
When updating, use a table that uses rows of words and increment the count?

--
Colin 
+1 320 221 9531
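
A minimal CQL3 sketch of that, with one counter per word (reusing the
word_count table name from the original schema):

  CREATE TABLE word_count (
    word text PRIMARY KEY,
    count counter
  );

  -- increment on every occurrence, no read-before-write needed:
  UPDATE word_count SET count = count + 1 WHERE word = 'cassandra';

Finding the top-N words would still mean scanning the table periodically, as
David describes, since counters can't be indexed or sorted server-side.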

 

 On Jan 20, 2014, at 6:58 AM, David Tinker david.tin...@gmail.com wrote:
 
 I haven't actually tried to use that schema yet, it was just my first idea. 
 If we use that solution our app would have to read the whole table once a day 
 or so to find the top 5000'ish words.
 
 
 On Fri, Jan 17, 2014 at 2:49 PM, Jonathan Lacefield 
 jlacefi...@datastax.com wrote:
 Hi David,
 
   How do you know that you are receiving a seek for each row?  Are you 
 querying for a specific word at a time or do the queries span multiple 
 words, i.e. what's the query pattern? Also, what is your goal for read 
 latency?  Most customers can achieve microsecond partition key base query 
 reads with Cassanda.  This can be done through tuning, data modeling, and/or 
 scaling.  Please post a cfhistograms for this table as well as provide some 
 details on the specific queries you are running.
 
 Thanks,
 
 Jonathan
 
 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
 
 
 
 
 
 
 On Fri, Jan 17, 2014 at 1:41 AM, David Tinker david.tin...@gmail.com 
 wrote:
 I have an app that stores lots of bits of text in Cassandra. One of
 the things I need to do is keep a global word frequency table.
 Something like this:
 
 CREATE TABLE IF NOT EXISTS word_count (
   word text,
   count counter,
   PRIMARY KEY (word)
 );
 
 This is slow to read as the rows (100's of thousands of them) each
 need a seek. Is there a better way to model this in Cassandra? I could
 periodically snapshot the rows into a fat row in another table I
 suppose.
 
 Or should I use Redis or something instead? I would prefer to keep it
 all Cassandra if possible.
 
 
 
 -- 
 http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration


Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
Ahoy the list.  I am evaluating Cassandra in the context of using it as a 
storage back end for the Titan graph database.

We’ll have several nodes in the cluster.  However, one of our requirements is 
that data has to be loaded into and stored on a specific node and only on that 
node.  Also, it cannot be replicated around the system, at least not stored 
persistently on disk – we will of course make copies in memory and on the wire 
as we access remote notes.  These requirements are non-negotiable.

We understand that this is essentially the opposite of what Cassandra is 
designed for, and that we’re missing all the scalability and robustness, but is 
it technically possible?

First, I would need to create a custom partitioner – is there any tutorial on 
that?  I see a few “you don’t need to” threads, but I do.

Second, how easy is it to have Cassandra not replicate data between nodes in a 
cluster?  I’m not seeing an obvious configuration option for that, presumably 
because it obviates much of the point of using Cassandra, but again, we’re 
working within some rather unfortunate constraints.

Any hints or suggestions would be most gratefully received.

Kind regards,

-Colin MacDonald-



RE: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
 -Original Message-
 From: Janne Jalkanen [mailto:janne.jalka...@ecyrd.com]
 
 Essentially you want to turn off all the features which make Cassandra a
 robust product ;-).

Oh, I don't want to, but sadly those are the requirements that I have to work 
with.

Again, the context is using it as the storage back end for a graph database.  I'm 
currently looking at the Titan graph DBMS, which supports the use of Cassandra 
or HBase for a distributed graph, both of which will need to be hobbled to 
prevent them working the way they're designed.

So it really is a question of: *can* I cripple Cassandra in this way, and if so 
how?

Thanks for the response.

-Colin MacDonald- 



RE: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
 -Original Message-
 From: Sylvain Lebresne [mailto:sylv...@datastax.com]
 Sent: 18 December 2013 10:45
 
 You seem to be well aware that you're not looking at using Cassandra for
 what it is designed for (which obviously imply you'll need to expect under-
 optimal behavior), so I'm not going to insist on it.

Very kind of you. ;)

I suspect that this requirement is viscerally horrifying, but as I said, 
it's idiosyncratic, specified by an... idiosyncrat.

It's a pragmatic solution that I'm looking for, just to get a proof of concept 
going, it doesn't have to be elegant at this stage.

 As to how you could achieve that, a relatively simple solution (that do not
 require writing your own partitioner) would consist in using 2 datacenters
 (that obviously don't have to be real physical datacenter), to put the one 
 that
 should have it all in one datacenter with RF=1 and to put all other nodes in
 the other datacenter with RF=0.
 
 As Janne said, you could still have hint being written by other nodes if the
 one storage node is dead, but you can use the system property
 cassandra.maxHintTTL to 0 to disable hints.

Thanks Sylvain, I'll look into that.  I'm coming to Cassandra cold; I hadn't 
even spotted that the replication factor was configurable - I don't see an 
option for it in the cassandra.yaml that came with 2.0.2.  I should be able to 
figure it out though, and that's great news, it looks like it takes care of one 
issue.

However, I'm not immediately seeing how to control which node will get the 
single copy of the data.  Won't the partitioner still allocate data around the 
cluster?

Ah, is a datacentre a logical group *within* an overall cluster?  So I can 
create a separate datacentre for each node, and if I write to that node the 
data will be forced to stay in that datacentre, i.e. that node?

I do apologise for the noobish questions, my attention is currently split 
between investigating several possible solutions.  I rather favour Cassandra 
though, if I can hobble it appropriately.

Kind regards,

-Colin MacDonald- 



RE: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Colin MacDonald
 -Original Message-
 From: Sylvain Lebresne [mailto:sylv...@datastax.com]
 Sent: 18 December 2013 12:46
 Google up NetworkTopologyStrategy. This is what you want to use and it's
 not configured in cassandra.yaml but when you create the keyspace.
 
 Basically, you define your topology in cassandra-topology.properties (where you
 basically manually set which node is in which DC, which you can really just
 see as assigning nodes to named groups) and then you can define the
 replication factor for each DC (so if RF=1 on the 1 node group and 0 on the
 other nodes group, C* will gladly honor it and store no data on node of the
 other nodes group).
 
 --
 Sylvain

Thank you so much, that's clear and helpful.  I appreciate you taking the time 
to explain it.

-Colin-
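
For what it's worth, the keyspace side of Sylvain's suggestion is just a per-DC
replica count. A minimal sketch with illustrative keyspace and DC names; any DC
left out of the map simply gets no replicas, and the node-to-DC assignment
itself comes from the snitch configuration (e.g. PropertyFileSnitch with
cassandra-topology.properties):

  CREATE KEYSPACE graphdata
    WITH replication = {'class': 'NetworkTopologyStrategy', 'storage_dc': 1};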


Re: Cassandra book/tuturial

2013-10-27 Thread Colin
I wouldn't buy that book; it's old and not too useful.


Find some tutorials and dive in.

 

 On Oct 27, 2013, at 8:54 PM, Mohan L l.mohan...@gmail.com wrote:
 
 
 
 
 On Sun, Oct 27, 2013 at 9:57 PM, Erwin Karbasi er...@optinity.com wrote:
 Hey Guys,
 
 What is the best book to learn Cassandra from scratch?
 
 Thanks in advance,
 Erwin
 
 Hi,
 
 Buy :
 
 Cassandra: The Definitive Guide By Eben Hewitt : 
 http://shop.oreilly.com/product/0636920010852.do
 
 Thanks
 Mohan L
 
 


Re: disappointed

2013-07-25 Thread Colin
Mysql?

--
Colin 
+1 320 221 9531

 

On Jul 25, 2013, at 6:08 AM, Derek Andree dand...@lacunasystems.com wrote:

 Yeah, Rob is smart.  don't run crap in production.  Run what others are 
 stable at.  If you are running the latest greatest dumbest craziest in prod 
 then you ask for fail, and you will get just that.
 
 FAIL
 
 On Jul 24, 2013, at 12:06 PM, Robert Coli rc...@eventbrite.com wrote:
 
 A better solution would likely involve not running cutting edge code in 
 production.. if you find yourself needing to upgrade production anything on 
 the day of a release, you are probably ahead of the version it is reasonable 
 to run in production.
 
 If you're already comfortable with this high level of risk in production, I 
 don't really see small manual patches as significantly increasing your level 
 of risk...
 
 =Rob
 


Re: sstable size ?

2013-07-17 Thread Colin Blower
Take a look at the very recent thread called 'Alternate major
compaction'. There are some ideas in there about splitting up a large
SSTable.

http://www.mail-archive.com/user@cassandra.apache.org/msg30956.html


On 07/17/2013 04:17 PM, Langston, Jim wrote:
 Hi all,

 Is there a way to get an SSTable to a smaller size ? By this I mean
 that I 
 currently have an SSTable that is nearly 1.2G, so that subsequent SSTables
 when they compact are trying to grow to that size. The result is that
 when 
 the min_compaction_threshold reaches it value and a compaction is needed, 
 the compaction is taking a long time as the file grows (it is
 currently at 52MB and
 takes ~22s to compact).

 I'm not sure how the SSTable initially grew to its current size of
 1.2G, since the
 servers have been up for a couple of years. I hadn't noticed until I
 just upgraded to 1.2.6,
 but now I see it affects everything. 


 Jim


-- 
*Colin Blower*
/Software Engineer/
Barracuda Networks Inc.
+1 408-342-5576 (o)


Re: Token Aware Routing: Routing Key Vs Composite Key with vnodes

2013-07-11 Thread Colin Blower
It is my understanding that you must have all parts of the partition key
in order to calculate the token. The partition key is the first part of
the primary key, in your case the userId.

You should be able to get the token from the userId. Give it a try:
cqlsh select userId, token(userId) from users limit 10;
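
For contrast, a table with a composite partition key needs every partition-key
column before token() (or a token-aware driver) can compute anything. A small
sketch with illustrative names:

  CREATE TABLE users_by_name (
    firstname text,
    lastname text,
    userId uuid,
    age int,
    PRIMARY KEY ((firstname, lastname), userId)
  );

  SELECT token(firstname, lastname), userId FROM users_by_name LIMIT 10;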

On 07/11/2013 08:54 AM, Haithem Jarraya wrote:
 Hi All,

 I am a bit confused on how the underlying token aware routing is
 working in the case of composite key.
 Let's say I have a column family like this USERS( uuid userId, text
 firstname, text lastname, int age, PRIMARY KEY(userId, firstname,
 lastname))

 My question is: do we need to have the values of userId, firstName
 and lastName available at the same time to create the token from the
 composite key, or can we get the right token just by looking at the
 routing key userId?

 Looking at the datastax driver code is a bit confusing; it seems that
 it calculates the token only when all the values of a composite key are
 available, or am I missing something?

 Thanks,

 Haithem



-- 
*Colin Blower*
/Software Engineer/
Barracuda Networks Inc.
+1 408-342-5576 (o)


Re: Date range queries

2013-06-25 Thread Colin Blower
You could just separate the history data from the current data. Then
when the user's result is updated, just write into two tables.

CREATE TABLE all_answers (
  user_id uuid,
  created timeuuid,
  result text,
  question_id varint,
  PRIMARY KEY (user_id, created)
)

CREATE TABLE current_answers (
  user_id uuid,
  question_id varint,
  created timeuuid,
  result text,
  PRIMARY KEY (user_id, question_id)
)


 select * FROM current_answers;

 user_id                              | question_id | result | created
--------------------------------------+-------------+--------+--------------------------------------
 11b1e59c-ddfa-11e2-a28f-0800200c9a66 |           1 | no     | f9893ee0-ddfa-11e2-b74c-35d7be46b354
 11b1e59c-ddfa-11e2-a28f-0800200c9a66 |           2 | blah   | f7af75d0-ddfa-11e2-b74c-35d7be46b354

 select * FROM all_answers;

 user_id                              | created                              | question_id | result
--------------------------------------+--------------------------------------+-------------+--------
 11b1e59c-ddfa-11e2-a28f-0800200c9a66 | f0141234-ddfa-11e2-b74c-35d7be46b354 |           1 | yes
 11b1e59c-ddfa-11e2-a28f-0800200c9a66 | f7af75d0-ddfa-11e2-b74c-35d7be46b354 |           2 | blah
 11b1e59c-ddfa-11e2-a28f-0800200c9a66 | f9893ee0-ddfa-11e2-b74c-35d7be46b354 |           1 | no

This way you can get the history of answers if you want and there is a
simple way to get the most current answers.

Just a thought.
-Colin B.
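
A small sketch of the double write, reusing the values from the sample output
above. Generate the timeuuid client-side so both tables store the same one, and
a logged batch keeps the pair together:

  BEGIN BATCH
    INSERT INTO all_answers (user_id, created, question_id, result)
      VALUES (11b1e59c-ddfa-11e2-a28f-0800200c9a66, f9893ee0-ddfa-11e2-b74c-35d7be46b354, 1, 'no');
    INSERT INTO current_answers (user_id, question_id, created, result)
      VALUES (11b1e59c-ddfa-11e2-a28f-0800200c9a66, 1, f9893ee0-ddfa-11e2-b74c-35d7be46b354, 'no');
  APPLY BATCH;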


On 06/24/2013 03:28 PM, Christopher J. Bottaro wrote:
 Yes, that makes sense and that article helped a lot, but I still have
 a few questions...

 The created_at in our answers table is basically used as a version id.
  When a user updates his answer, we don't overwrite the old answer,
 but rather insert a new answer with a more recent timestamp (the version).

 answers
 ---
 user_id | created_at | question_id | result
 ---
   1 | 2013-01-01 | 1   | yes
   1 | 2013-01-01 | 2   | blah
   1 | 2013-01-02 | 1   | no

 So the queries we really want to run are find me all the answers for
 a given user at a given time.  So given the date of 2013-01-02 and
 user_id 1, we would want rows 2 and 3 returned (since rows 3 obsoletes
 row 1).  Is it possible to do this with CQL given the current schema?

 As an aside, we can do this in Postgresql using window functions, not
 standard SQL, but pretty neat.

 We can alter our schema like so...

 answers
 ---
 user_id | start_at | end_at | question_id | result

 Where the start_at and end_at denote when an answer is active.  So the
 example above would become:

 answers
 ---
 user_id | start_at   | end_at | question_id | result
 
   1 | 2013-01-01 | 2013-01-02 | 1   | yes
   1 | 2013-01-01 | null   | 2   | blah
   1 | 2013-01-02 | null   | 1   | no

 Now we can query SELECT * FROM answers WHERE user_id = 1 AND start_at
 <= '2013-01-02' AND (end_at > '2013-01-02' OR end_at IS NULL).

 How would one define the partitioning key and cluster columns in CQL
 to accomplish this?  Is it as simple as PRIMARY KEY (user_id,
 start_at, end_at, question_id) (remembering that we sometimes want to
 limit by question_id)?

 Also, we are a bit worried about race conditions.  Consider two
 separate processes updating an answer for a given user_id /
 question_id.  There will be a race condition between the two to update
 the correct row's end_at field.  Does that make sense?  I can draw it
 out with ASCII tables, but I feel like this email is already too
 long... :P

 Thanks for the help.



 On Wed, Jun 19, 2013 at 2:28 PM, David McNelis dmcne...@gmail.com
 mailto:dmcne...@gmail.com wrote:

 So, if you want to grab by the created_at and occasionally limit
 by question id, that is why you'd use created_at.

 The way the primary keys work is the first part of the primary key
 is the Partioner key, that field is what essentially is the single
 cassandra row.  The second key is the order preserving key, so you
 can sort by that key.  If you have a third piece, then that is the
 secondary order preserving key.

 The reason you'd want to do (user_id, created_at, question_id) is
 because when you do a query on the keys, if you MUST use the
 preceding pieces of the primary key.  So in your case, you could
 not do a query with just user_id and question_id with the
 user-created-question key.  Alternatively if you went with
 (user_id, question_id, created_at), you would not be able to
 include a range of created_at unless you were also filtering on
 the question_id.

 Does that make sense?

 As for the large rows, 10k is unlikely to cause you too many
 issues (unless the answer is potentially a big blob of text).
  Newer versions of cassandra deal with a lot of things

Starting up Cassandra occurred errors after upgrading Cassandra to 1.2.5 from 1.0.12

2013-05-29 Thread Colin Kuo
Hi All,

We followed the upgrade guide (
http://www.datastax.com/docs/1.2/install/upgrading) from the Datastax web site
and upgraded Cassandra to 1.2.5, but errors occurred in system.log when
starting up.

After digging into the code, it looks like Cassandra found the file
length of the IndexSummary sstable to be zero, and thus threw an
AssertionError. In fact, the file length of the IndexSummary is about 80
bytes, not zero. It's weird.

Also, we observed that it only happens on the IndexSummary file of a secondary
index. The errors are reproducible. Below are my upgrade steps.
1. Shutdown all of client applications.
2. Run nodetool drain before shutting down the existing Cassandra service.
3. Stop old Cassandra process, then start the new binary process using
migrated cassandra.yaml.
4. Run nodetool upgradesstables -a in order to upgrade all sstable
files to the new format.
5. Restart Cassandra process and monitor the logs file for any issues.
At step 5, we found the error messages as below.

Any ideas?

Thank you!
Colin

===
 INFO [SSTableBatchOpen:2] 2013-05-29 04:38:40,085 SSTableReader.java (line
169) Opening
/var/lib/cassandra/data/ks/user/ks-user.ks_user_personalID-ic-61 (58 bytes)
ERROR [SSTableBatchOpen:1] 2013-05-29 04:38:40,085 CassandraDaemon.java
(line 175) Exception in thread Thread[SSTableBatchOpen:1,5,main]
java.lang.AssertionError
at
org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:401)
at
org.apache.cassandra.io.sstable.IndexSummary$IndexSummarySerializer.deserialize(IndexSummary.java:124)
 at
org.apache.cassandra.io.sstable.SSTableReader.loadSummary(SSTableReader.java:426)
at
org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:360)
 at
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:201)
at
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:154)
 at
org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:241)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
ERROR [SSTableBatchOpen:2] 2013-05-29 04:38:40,085 CassandraDaemon.java
(line 175) Exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.AssertionError
at
org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:401)
at
org.apache.cassandra.io.sstable.IndexSummary$IndexSummarySerializer.deserialize(IndexSummary.java:124)
 at
org.apache.cassandra.io.sstable.SSTableReader.loadSummary(SSTableReader.java:426)
at
org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:360)
 at
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:201)
at
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:154)
 at
org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:241)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
===


Re: Any experience of 20 node mini-itx cassandra cluster

2013-04-12 Thread Colin Blower
If you have not seen it already, check out the Netflix blog post on their
performance testing of AWS SSD instances.

http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html

My guess, based on very little experience, is that you will be CPU bound.

On 04/12/2013 03:05 AM, Jabbar Azam wrote:
 Hello,

 I'm going to be building a 20 node cassandra cluster in one
 datacentre. The spec of the servers will roughly be dual core Celeron
 CPU, 256 GB SSD, 16GB RAM and two nics.


 Has anybody done any performance testing with this setup, or are there any
 gotchas I should be aware of with regard to the hardware?

  I do realise the CPU has fairly low computational power, but I'm going
 to assume the system is going to be IO bound, hence the RAM and SSDs.


 Thanks

 Jabbar Azam


-- 
*Colin Blower*
/Software Engineer/
Barracuda Networks Inc.
+1 408-342-5576 (o)


Re: Creating a keyspace fails

2013-01-22 Thread Colin Blower
You were most likely looking at the wrong documentation. The syntax for 
CQL3 changed between Cassandra 1.1 and 1.2. When I google cassandra 
CQL3 the first result is Cassandra 1.1 documentation about CQL3, which 
is wrong for 1.2.


Make sure you are looking at the documentation for the version you are 
using. It might also be nice for DataStax to update the 1.1 
documentation with a warning.


--
*Colin Blower*


On 01/22/2013 04:06 AM, Paul van Hoven wrote:

Okay, that worked. Why is the statement from the tutorial wrong? I
mean, why would a company like DataStax post something like this?

2013/1/22 Jason Wee peich...@gmail.com:

cqlsh CREATE KEYSPACE demodb WITH replication = {'class': 'SimpleStrategy',
'replication_factor': 3};
cqlsh use demodb;
cqlsh:demodb


On Tue, Jan 22, 2013 at 7:04 PM, Paul van Hoven
paul.van.ho...@googlemail.com wrote:

CREATE KEYSPACE demodb WITH strategy_class = 'SimpleStrategy'
AND strategy_options:replication_factor='1';








Re: Cassandra to Oracle?

2012-01-22 Thread Colin Clark
You don't have to use Oracle and pay money; you can use PostgreSQL, for
example.

Triggers aren't that hard to implement.  We actually do all of our
mutations now via triggers, and we did it inside Cassandra by effectively
overriding the mutate logic itself.
On Jan 20, 2012 11:42 AM, Zach Richardson j.zach.richard...@gmail.com
wrote:

 How much data do you think you will need ad hoc query ability for?

 On Fri, Jan 20, 2012 at 11:28 AM, Brian O'Neill b...@alumni.brown.eduwrote:


 I can't remember if I asked this question before, but

 We're using Cassandra as our transactional system, and building up quite
 a library of map/reduce jobs that perform data quality analysis,
 statistics, etc.
 ( 100 jobs now)

 But... we are still struggling to provide an ad-hoc query mechanism for
 our users.

 To fill that gap, I believe we still need to materialize our data in an
 RDBMS.

 Anyone have any ideas?  Better ways to support ad-hoc queries?

 Effectively, our users want to be able to select count(distinct Y) from X
 group by Z.
 Where Y and Z are arbitrary columns of rows in X.

 We believe we can create column families with different key structures
 (using Y an Z as row keys), but some column names we don't know / can't
 predict ahead of time.

 Are people doing bulk exports?
 Anyone trying to keep an RDBMS in synch in real-time?

 -brian

 --
 Brian ONeill
 Lead Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://weblogs.java.net/blog/boneill42/
 blog: http://brianoneill.blogspot.com/





Re: Second Cassandra users survey

2011-11-07 Thread Colin Taylor
Decompression without compression (for lack of a better name).

We store into Cassandra log batches that come in over HTTP either
uncompressed, deflated, or snappy-compressed. We just add 'magic', e.g. \0 \s \n \a \p
\p \y, as a prefix to the column value so we can decode it when we serve
it back up.

Seems like Cassandra could detect data with the appropriate magic,
store as is and decode for us automatically on the way back.

Colin.


Re: [VOTE] Release Mojo's Cassandra Maven Plugin 0.8.6-1

2011-09-21 Thread Colin Taylor
+1 (non binding but lgtm)

On Wed, Sep 21, 2011 at 2:27 AM, Stephen Connolly
stephen.alan.conno...@gmail.com wrote:
 Hi,

 I'd like to release version 0.8.6-1 of Mojo's Cassandra Maven Plugin
 to sync up with the recent 0.8.6 release of Apache Cassandra.


 We solved 2 issues:
 http://jira.codehaus.org/secure/ReleaseNote.jspa?projectId=12121version=17425


 Staging Repository:
 https://nexus.codehaus.org/content/repositories/orgcodehausmojo-010/

 Site:
 http://mojo.codehaus.org/cassandra-maven-plugin/index.html

 SCM Tag:
 https://svn.codehaus.org/mojo/tags/cassandra-maven-plugin-0.8.6-1@14748

  [ ] +1 Yeah! fire ahead oh and the blind man on the galloping horse
 says it looks fine too.
  [ ] 0 Mehhh! like I care, I don't have any opinions either, I'd
 follow somebody else if only I could decide who
  [ ] -1 No! wait up there I have issues (in general like, ya know,
 and being a trouble-maker is only one of them)

 The vote is open for 72 hours and will succeed by lazy consensus.

 Cheers

 -Stephen

 P.S.
  In the interest of ensuring (more is) better testing, this vote is
 also open to subscribers of the dev and user@cassandra.apache.org
 mailing lists



Re: Not all data structures need timestamps (and don't require wasted memory).

2011-09-04 Thread Colin
Kevin,

You will find that many of us using Cassandra are already doing what you suggest 
(custom serializer/deserializer).

We call it JSON.

--
Colin

*Sent from Star Trek like flat panel device, which although larger than my Star 
Trek like communicator device, may have typo's and exhibit improper grammar due 
to haste and less than perfect use of the virtual keyboard*
 

On Sep 4, 2011, at 12:11 AM, Kevin Burton bur...@spinn3r.com wrote:

 
 
 On Sat, Sep 3, 2011 at 8:53 PM, Jonathan Ellis jbel...@gmail.com wrote:
 I strongly suspect that you're optimizing prematurely.  What evidence
 do you have that timestamps are producing unacceptable overhead for
 your workload?  
 
 It's possible … this is back of the envelope at the moment as right now it's 
 a nonstarter.  
  
 You do realize that the sparse data model means that
 we spend a lot more than 8 bytes storing column names in-line with
 each column too, right?
 
 Yeah… this can be mitigated if the column names are your data.
  
 
 If disk space is really the limiting factor for your workload, I would
 recommend testing the compression code in trunk.  That will get you a
 lot farther than adding extra options for a very niche scenario.
 
 
 Another thing I've been considering is building a serializer/deserializer in 
 front of Cassandra and running my own protocol to talk to it which builds its 
 own encoding per row to avoid using excessive columns.
 
 Kevin 
 
 -- 
 Founder/CEO Spinn3r.com
 
 Location: San Francisco, CA
 Skype: burtonator
 Skype-in: (415) 871-0687
 


bring out your rpms...

2011-06-14 Thread Colin
Does anyone know where an rpm for 0.7.6-2 might be? (rhel)

 

I checked the datastax site and only see up to 0.7.6-1



RE: bring out your rpms...

2011-06-14 Thread Colin
Thanks Nate.  I appreciate it.

-Original Message-
From: Nate McCall [mailto:n...@datastax.com] 
Sent: Tuesday, June 14, 2011 4:52 PM
To: user@cassandra.apache.org
Subject: Re: bring out your rpms...

The 0.7.6-2 release was made over *-1 specifically to correct an issue with
debian packaging.

This keeps coming up though, so I'll probably just go ahead and roll a
0.7.6-2 for rpm.datastax.com so as not to confuse folks.


On Tue, Jun 14, 2011 at 4:19 PM, Colin colpcl...@gmail.com wrote:
 Does anyone know where an rpm for 0.7.6-2 might be? (rhel)



 I checked the datastax site and only see up to 0.7.6-1



Re: need some help with counters

2011-06-09 Thread Colin
Hey guy, have you tried amazon turk?

--
Colin Clark
+1 315 886 3422 cell
+1 701 212 4314 office
http://cloudeventprocessing.com
http://blog.cloudeventprocessing.com
@EventCloudPro

*Sent from Star Trek like flat panel device, which although larger than my Star 
Trek like communicator device, may have typo's and exhibit improper grammar due 
to haste and less than perfect use of the virtual keyboard*
 

On Jun 9, 2011, at 3:41 PM, Ian Holsman had...@holsman.net wrote:

 Hi.
 
 I had a brief look at CASSANDRA-2103 (expiring counter columns), and I was 
 wondering if anyone can help me with my problem.
 
 I want to keep some page-view stats on a URL at different levels of 
 granularity (page views per hour, page views per day, page views per year etc 
 etc).
 
 
  so my thinking was to create something like a counter with a key based on 
 Year-Month-Day-Hour, and simply increment the counter as I go along. 
 
  this works well and I'm getting my metrics beautifully put into the right 
 places.
 
 the only problem I have is that I only need the last 48-hours worth of 
 metrics at the hour level.
 
 how do I get rid of the old counters? 
  do I need to write an archiver that will go through each url (could be 
 millions) and just delete them?
 
 I'm sure other people have encountered this, and was wondering how they 
 approached it.
 
 TIA
 Ian
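
In CQL3 terms, the Year-Month-Day-Hour idea described above maps onto a counter
table with the time bucket as a clustering column; a rough sketch with
illustrative names. Note that counter columns can't carry a TTL, so expiring
old hours still takes an explicit delete (e.g. from the archiver mentioned):

  CREATE TABLE page_views (
    url text,
    bucket text,      -- e.g. '2011-06-09-15' for hourly, '2011-06-09' for daily
    views counter,
    PRIMARY KEY (url, bucket)
  );

  UPDATE page_views SET views = views + 1
   WHERE url = 'http://example.com/' AND bucket = '2011-06-09-15';

  -- dropping a stale hour is an explicit delete:
  DELETE FROM page_views WHERE url = 'http://example.com/' AND bucket = '2011-06-07-15';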

