Re: best practices for time-series data with massive amounts of records

2015-03-06 Thread Clint Kelly
Hi all, Thanks for the responses, this was very helpful. I don't know yet what the distribution of clicks and users will be, but I expect to see a few users with an enormous amount of interactions and most users having very few. The idea of doing some additional manual partitioning, and then
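One common way to do the kind of manual partitioning mentioned above is to add a time bucket to the partition key, so that a very active user's data is spread over many bounded partitions. The following is a minimal sketch only: the one-day bucket width and the names `bucket_for` and `partition_key` are assumptions for illustration, and the thread does not say which scheme was eventually chosen.

```python
from datetime import datetime, timezone

# Hypothetical bucket width: one partition per user per day (an assumption).
BUCKET_SECONDS = 24 * 60 * 60


def bucket_for(event_time: datetime) -> int:
    """Return the start (epoch seconds) of the bucket containing event_time."""
    epoch = int(event_time.timestamp())
    return epoch - (epoch % BUCKET_SECONDS)


def partition_key(userid: str, event_time: datetime) -> tuple:
    """Compound partition key (userid, bucket) that bounds rows per partition."""
    return (userid, bucket_for(event_time))
```

With a key like this, a reader that wants a time range for one user has to query every bucket that overlaps the range, which is the cost paid for keeping each partition small.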

best practices for time-series data with massive amounts of records

2015-03-02 Thread Clint Kelly
Hi all, I am designing an application that will capture time series data where we expect the number of records per user to potentially be extremely high. I am not sure if we will eclipse the max row size of 2B elements, but I assume that we would not want our application to approach that size

Re: how to scan all rows of cassandra using multiple threads

2015-02-25 Thread Clint Kelly
Hi Gaurav, I recommend you just run a MapReduce job for this computation. Alternatively, you can look at the code for the C* MapReduce input format: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlInputFormat.java That should give you what you need

Re: Why no virtual nodes for Cassandra on EC2?

2015-02-23 Thread Clint Kelly
Hi mck, I'm not familiar with this ticket, but my understanding was that performance of Hadoop jobs on C* clusters with vnodes was poor because a given Hadoop input split has to run many individual scans (one for each vnode) rather than just a single scan. I've run C* and Hadoop in production
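The cost described above can be put into rough numbers. This is a back-of-the-envelope sketch, assuming one scan per token range owned by each node; the function name `scans_per_job` is hypothetical, and the figure of 256 tokens per node is used only as an illustrative vnode setting. Real split counts also depend on the configured split size.

```python
def scans_per_job(nodes: int, tokens_per_node: int) -> int:
    """Token-range scans needed to cover the whole ring once."""
    return nodes * tokens_per_node


# Illustrative comparison for a 10-node cluster.
without_vnodes = scans_per_job(nodes=10, tokens_per_node=1)
with_vnodes = scans_per_job(nodes=10, tokens_per_node=256)
```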

Re: Running Cassandra + Spark on AWS - architecture questions

2015-02-23 Thread Clint Kelly
, Feb 20, 2015 at 10:17 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi all, I read the DSE 4.6 documentation and I'm still not 100% sure what a mixed workload Cassandra + Spark installation would look like, especially on AWS. What I gather is that you use OpsCenter to set up the following

Any notion of unions in C* user-defined types?

2015-02-23 Thread Clint Kelly
Hi all, I am building an application that keeps a time-series record of clickstream data (clicks, impressions, etc.). The data model looks something like: CREATE TABLE clickstream ( userid text, event_time timestamp, interaction frozen<interaction_type>, PRIMARY KEY (userid, event_time) )

Running Cassandra + Spark on AWS - architecture questions

2015-02-20 Thread Clint Kelly
Hi all, I read the DSE 4.6 documentation and I'm still not 100% sure what a mixed workload Cassandra + Spark installation would look like, especially on AWS. What I gather is that you use OpsCenter to set up the following: - One virtual data center for real-time processing (e.g., ingestion

Re: Why no virtual nodes for Cassandra on EC2?

2015-02-20 Thread Clint Kelly
mean that paying a small efficiency cost when reading data out of Cassandra initially might not be the end of the world (especially given the benefits of using vnodes). On Fri, Feb 20, 2015 at 8:29 AM, Clint Kelly clint.ke...@gmail.com wrote: Hi Mark, Thanks for your reply. That makes sense. I

AMI to use to launch a cluster with OpsCenter on AWS

2015-02-20 Thread Clint Kelly
Hi all, I am trying to follow the instructions here for installing DSE 4.6 on AWS: http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/install/installAMIOpsc.html I was successful creating a single-node instance running OpsCenter, which I intended to bootstrap

Re: AMI to use to launch a cluster with OpsCenter on AWS

2015-02-20 Thread Clint Kelly
:36 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi all, I am trying to follow the instructions here for installing DSE 4.6 on AWS: http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/install/installAMIOpsc.html I was successful creating a single-node instance

Re: Why no virtual nodes for Cassandra on EC2?

2015-02-20 Thread Clint Kelly
and that type of workload. This is by no means a warning for users to disable vnodes on their Real-Time/Transactional Cassandra only clusters on EC2. I've used vnodes on EC2 without issue. Regards, Mark On 20 February 2015 at 05:08, Clint Kelly clint.ke...@gmail.com wrote: Hi all

Why no virtual nodes for Cassandra on EC2?

2015-02-19 Thread Clint Kelly
Hi all, The guide for installing Cassandra on EC2 says that Note: The DataStax AMI does not install DataStax Enterprise nodes with virtual nodes enabled. http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/install/installAMI.html Just curious why this is the case.

Re: No schema agreement from live replicas?

2015-02-03 Thread Clint Kelly
FWIW increasing the threshold for withMaxSchemaAgreementWaitSeconds to 30sec was enough to fix my problem---I would like to understand whether the cluster has some kind of configuration problem that made doing so necessary, however. Thanks! On Tue, Feb 3, 2015 at 7:44 AM, Clint Kelly clint.ke

No schema agreement from live replicas?

2015-02-03 Thread Clint Kelly
Hi all, I have an application that uses the Java driver to create a table and then immediately write to it. I see the following warning in my logs: [10.241.17.134] out: 15/02/03 09:32:24 WARN com.datastax.driver.core.Cluster: No schema agreement from live replicas after 10 s. The schema may not

Best practice for emulating a Cassandra timeout during unit tests?

2014-12-09 Thread Clint Kelly
Hi all, I'd like to write some tests for my code that uses the Cassandra Java driver to see how it behaves if there is a read timeout while accessing Cassandra. Is there a best-practice for getting this done? I was thinking about adjusting the settings in the cluster builder to adjust the

any way to get nodetool proxyhistograms data for an entire cluster?

2014-11-19 Thread Clint Kelly
If I run this tool on a given host, it shows me stats for only the cases where that host was the coordinator node, correct? Is there any way (other than me cooking up a little script) to automatically get the proxyhistogram stats for my entire cluster? -Clint

Re: any way to get nodetool proxyhistograms data for an entire cluster?

2014-11-19 Thread Clint Kelly
, 2014, at 8:48 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Nov 19, 2014 at 3:22 PM, Clint Kelly clint.ke...@gmail.com wrote: Is there any way (other than me cooking up a little script) to automatically get the proxyhistogram stats for my entire cluster? OpsCenter might expose

What time range does nodetool cfhistograms use?

2014-11-16 Thread Clint Kelly
Hi all, Over what time range does nodetool cfhistograms operate? I am using Cassandra 2.0.8.39. I am trying to debug some very high 95th and 99th percentile read latencies in an application that I'm working on. I tried running nodetool cfhistograms to get a flavor for the distribution of read

Best practices for route tracing

2014-11-16 Thread Clint Kelly
Hi all, I am trying to debug some high-latency outliers (99th percentile) in an application I'm working on. I thought that I could turn on route tracing, print the route traces to logs, and then examine my logs after a load test to find the highest-latency paths and figure out what is going on.

Re: What time range does nodetool cfhistograms use?

2014-11-16 Thread Clint Kelly
shown the latencies within a single host, or are they the end-to-end latencies from the coordinator node? -- cfhistograms shows metrics at table/node level, proxyhistograms shows metrics at cluster/coordinator level On Sun, Nov 16, 2014 at 10:31 PM, Clint Kelly clint.ke...@gmail.com wrote

best practice for waiting for schema changes to propagate

2014-09-29 Thread Clint Kelly
Hi all, I often have problems with code that I write that uses the DataStax Java driver to create / modify a keyspace or table and then soon after reads the metadata for the keyspace to verify that whatever changes I made to the keyspace or table are complete. As an example, I may create a table
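A generic way to handle the situation described above is to poll for the expected state with a deadline instead of reading the metadata once. The sketch below is not the driver's API: `wait_until` and its parameters are hypothetical names, and the predicate stands in for whatever check the application performs (for example, "the new table is visible in the keyspace metadata").

```python
import time


def wait_until(predicate, timeout_s=30.0, interval_s=0.5):
    """Call predicate() until it returns True or timeout_s elapses.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Returning False rather than raising leaves it to the caller to decide whether a timeout is fatal or merely worth a warning.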

nondeterministic NoHostAvailableException occurs while dropping a table

2014-09-05 Thread Clint Kelly
Hi all, TL;DR - I think my unit tests are sometimes failing because of read timeouts to an EmbeddedCassandraService when dropping a table triggers a compaction on a highly-loaded build slave. Does this sound reasonable? What options should I change in my Cluster.Builder (or elsewhere) to prevent

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Clint Kelly
://github.com/Mishail/CqlJmeter -M On 8/17/14 12:26, Clint Kelly wrote: Hi all, Is there a way to use the cassandra-stress tool with clustering columns? I am trying to figure out whether an application that I'm running on is slow because of my application logic, C* data model, or underlying

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Clint Kelly
plugin may be useful in the latter case. https://github.com/Mishail/CqlJmeter -M On 8/17/14 12:26, Clint Kelly wrote: Hi all, Is there a way to use the cassandra-stress tool with clustering columns? I am trying to figure out whether an application that I'm running

cassandra-stress with clustering columns?

2014-08-17 Thread Clint Kelly
Hi all, Is there a way to use the cassandra-stress tool with clustering columns? I am trying to figure out whether an application that I'm running on is slow because of my application logic, C* data model, or underlying C* setup (e.g., I need more nodes or to tune some parameters). My

Re: cassandra-stress with clustering columns?

2014-08-17 Thread Clint Kelly
columns make a big difference in write performance? On Sun, Aug 17, 2014 at 12:26 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi all, Is there a way to use the cassandra-stress tool with clustering columns? I am trying to figure out whether an application that I'm running on is slow because

Re: question about OpsCenter agent

2014-08-15 Thread Clint Kelly
at the configuration options available to the datastax-agent see this page: datastax.com/documentation/opscenter/5.0/opsc/configure/agentAddressConfiguration.html Mark On Fri, Aug 15, 2014 at 3:32 AM, Clint Kelly clint.ke...@gmail.com wrote: Hi all, I just installed DataStax Enterprise 4.5. I

question about OpsCenter agent

2014-08-14 Thread Clint Kelly
Hi all, I just installed DataStax Enterprise 4.5. I installed OpsCenter Server on one of my four machines. The port that OpsCenter usually uses () was used by something else, so I modified /usr/share/opscenter/conf/opscenterd.conf to set the port to 8889. When I log into OpsCenter, it says

Re: Cassandra process exiting mysteriously

2014-08-12 Thread Clint Kelly
the shutdown lines in at least an hour before.. We're using C* 2.0.9. On Thu, Aug 7, 2014 at 12:49 AM, Clint Kelly clint.ke...@gmail.com wrote: Hi Rob, Thanks for the clarification; this is really useful. I'll run some experiments to see if the problem is a JVM OOM on our build machine

Re: Cassandra process exiting mysteriously

2014-08-06 Thread Clint Kelly
Hi Duncan, Thanks for your help. I am at a loss as to what is causing this process to stop then. I would not expect the Cassandra process to finish until my code calls Process#destroy, but it seems to non-deterministically stop much earlier sometimes. FWIW I have seen failures on another

Re: Cassandra process exiting mysteriously

2014-08-06 Thread Clint Kelly
Hi Rob, Thanks for the clarification; this is really useful. I'll run some experiments to see if the problem is a JVM OOM on our build machine. Best regards, Clint On Wed, Aug 6, 2014 at 1:14 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Aug 6, 2014 at 1:12 PM, Robert Coli

Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Hi all, Allow me to rephrase a question I asked last week. I am performing some queries with ALLOW FILTERING and getting consistent read timeouts like the following: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
for your help! Best regards, Clint On Tue, Aug 5, 2014 at 10:54 AM, Robert Coli rc...@eventbrite.com wrote: On Tue, Aug 5, 2014 at 10:01 AM, Clint Kelly clint.ke...@gmail.com wrote: Allow me to rephrase a question I asked last week. I am performing some queries with ALLOW FILTERING and getting

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
, Clint Kelly clint.ke...@gmail.com wrote: Hi Rob, Thanks for your feedback. I understand that use of ALLOW FILTERING is not a best practice. In this case, however, I am building a tool on top of Cassandra that allows users to sometimes do things that are less than optimal. When they try to do

Cassandra process exiting mysteriously

2014-08-05 Thread Clint Kelly
Hi everyone, For some integration tests, we start up a CassandraDaemon in a separate process (using the Java 7 ProcessBuilder API). All of my integration tests run beautifully on my laptop, but one of them fails on our Jenkins cluster. The failing integration test does around 10k writes to

Re: Cassandra process exiting mysteriously

2014-08-05 Thread Clint Kelly
regards, Clint On Tue, Aug 5, 2014 at 9:29 PM, Kevin Burton bur...@spinn3r.com wrote: If there is an oom it will be in the logs. On Aug 5, 2014 8:17 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi everyone, For some integration tests, we start up a CassandraDaemon in a separate process

Re: Occasional read timeouts seen during row scans

2014-08-04 Thread Clint Kelly
: Saturday, August 2, 2014 7:04 AM To: user@cassandra.apache.org Subject: Re: Occasional read timeouts seen during row scans Hi Clint, is time correctly synchronized between your nodes? Ciao, Duncan. On 02/08/14 02:12, Clint Kelly wrote: BTW a few other details, sorry for omitting

Occasional read timeouts seen during row scans

2014-08-01 Thread Clint Kelly
Hi everyone, I am seeing occasional read timeouts during multi-row queries, but I'm having difficulty reproducing them or understanding what the problem is. First, some background: Our team wrote a custom MapReduce InputFormat that looks pretty similar to the DataStax InputFormat except that it

Re: Occasional read timeouts seen during row scans

2014-08-01 Thread Clint Kelly
was observing the timeout) Best regards, Clint On Fri, Aug 1, 2014 at 5:02 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi everyone, I am seeing occasional read timeouts during multi-row queries, but I'm having difficulty reproducing them or understanding what the problem is. First, some

Re: Index creation sometimes fails

2014-07-25 Thread Clint Kelly
Hi Tyler, FWIW I was not able to reproduce this problem with a smaller example. I'll go ahead and file the JIRA anyway. Thanks for your help! Best regards, Clint On Thu, Jul 17, 2014 at 3:05 PM, Tyler Hobbs ty...@datastax.com wrote: On Thu, Jul 17, 2014 at 4:59 PM, Clint Kelly clint.ke

Re: Index creation sometimes fails

2014-07-17 Thread Clint Kelly
JIRA, correct? Best regards, Clint On Wed, Jul 16, 2014 at 4:32 PM, Tyler Hobbs ty...@datastax.com wrote: On Tue, Jul 15, 2014 at 1:40 PM, Clint Kelly clint.ke...@gmail.com wrote: Is there some way to get the driver to block until the schema code has propagated everywhere? My currently

How to maintain the N-most-recent versions of a value?

2014-07-17 Thread Clint Kelly
Hi everyone, I am trying to design a schema that will keep the N-most-recent versions of a value. Currently my table looks like the following: CREATE TABLE foo ( rowkey text, family text, qualifier text, version bigint, value blob, PRIMARY KEY (rowkey, family, qualifier,

Re: Index creation sometimes fails

2014-07-15 Thread Clint Kelly
5 seconds This loop took three iterations to create the index. Is this expected? This seems really weird! Best regards, Clint On Mon, Jul 14, 2014 at 5:54 PM, Clint Kelly clint.ke...@gmail.com wrote: BTW I have seen this using versions 2.0.1 and 2.0.3 of the java driver on a three-node

Re: Index creation sometimes fails

2014-07-15 Thread Clint Kelly
, 2014 at 11:32 AM, DuyHai Doan doanduy...@gmail.com wrote: As far as I know, schema propagation always takes some time in the cluster. On this mailing list some people in the past faced similar behavior. On Tue, Jul 15, 2014 at 8:20 PM, Clint Kelly clint.ke...@gmail.com wrote: FWIW I was able

Index creation sometimes fails

2014-07-14 Thread Clint Kelly
Hi everyone, I have some code that I've been fiddling with today that uses the DataStax Java driver to create a table and then create a secondary index on a column in that table. I've tested this code fairly thoroughly on a single-node Cassandra instance on my laptop and in unit tests (using the

Re: Index creation sometimes fails

2014-07-14 Thread Clint Kelly
BTW I have seen this using versions 2.0.1 and 2.0.3 of the java driver on a three-node cluster with DSE 4.5. On Mon, Jul 14, 2014 at 5:51 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi everyone, I have some code that I've been fiddling with today that uses the DataStax Java driver to create

Setting up DSE 4.5 for mixed workload with BYOH

2014-07-02 Thread Clint Kelly
Hi everyone, Apologies if this is the incorrect forum for a question like this. I am going to set up a mixed-workload (real-time and analytics) installation of DSE 4.5 using bring-your-own Hadoop (BYOH). We are using CDH 5.0. I was reviewing the installation instructions, and I came across the

Re: Setting up DSE 4.5 for mixed workload with BYOH

2014-07-02 Thread Clint Kelly
, you shouldn't enable vnodes on any Cassandra/DSE datacenter that is doing hadoop analytics workloads. Other DCs in the cluster can use vnodes. -Tupshin On Jul 2, 2014 5:50 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi everyone, Apologies if this is the incorrect forum for a question

Re: Setting up DSE 4.5 for mixed workload with BYOH

2014-07-02 Thread Clint Kelly
this would be an increase of several orders of magnitude in the number of input splits.) Best regards, Clint On Wed, Jul 2, 2014 at 6:04 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi Tupshin, Thanks for the quick reply. Is the performance concern from the Hadoop integration needing to set up

Re: Is the tarball for a given release in a Maven repository somewhere?

2014-05-22 Thread Clint Kelly
%3A%22apache-cassandra%22 On 05/20/2014 05:30 PM, Clint Kelly wrote: Hi all, I am using the maven assembly plugin to build a project that contains a development environment for a project that we've built at work on top of Cassandra. I'd like this development environment to include

Re: Is the tarball for a given release in a Maven repository somewhere?

2014-05-21 Thread Clint Kelly
Thanks, Lewis. I created a ticket here: https://issues.apache.org/jira/browse/CASSANDRA-7283 For now I just copied the cassandra and cassandra.in.sh scripts into my project, along with custom configuration files. We already have all of the necessary JARs in our project's lib directory, since

Is the tarball for a given release in a Maven repository somewhere?

2014-05-20 Thread Clint Kelly
Hi all, I am using the maven assembly plugin to build a project that contains a development environment for a project that we've built at work on top of Cassandra. I'd like this development environment to include the latest release of Cassandra. Is there a maven repo anywhere that contains an

Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-16 Thread Clint Kelly
Hi Anton, One approach you could look at is to write a custom InputFormat that allows you to limit the token range of rows that you fetch (if the AbstractColumnFamilyInputFormat does not do what you want). Doing so is not too much work. If you look at the class RowIterator within
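To make the idea of limiting the token range concrete, here is a small sketch that divides a token range into contiguous subranges, one per input split. It assumes the Murmur3 partitioner, whose tokens span -2^63 to 2^63 - 1; the function name `split_token_range` is hypothetical, and turning each subrange into a CQL `token()` restriction (including the choice of inclusive or exclusive bounds) is left out.

```python
# Token bounds for the Murmur3 partitioner (signed 64-bit range).
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def split_token_range(start, end, parts):
    """Split the inclusive range [start, end] into contiguous inclusive subranges."""
    if parts < 1 or start > end:
        raise ValueError("need parts >= 1 and start <= end")
    total = end - start + 1
    parts = min(parts, total)  # never produce empty subranges
    base, extra = divmod(total, parts)
    ranges = []
    lo = start
    for i in range(parts):
        # The first `extra` subranges absorb the remainder, one token each.
        size = base + (1 if i < extra else 0)
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges
```

Restricting a job to part of the ring then amounts to passing a narrower `start` and `end` instead of the full bounds.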

Hadoop InputFormat that supports multiple queries

2014-05-12 Thread Clint Kelly
Hi everyone, A couple of months ago I started working on a new Hadoop InputFormat that we needed for something at my work. It is in a semi-working state now so I thought I would post a link in case anyone is interested: https://github.com/wibiclint/cassandra2-hadoop2 At the time I started

Re: Error evicting cold readers when launching an EmbeddedCassandraService for a second time

2014-05-02 Thread Clint Kelly
...@gmail.com wrote: Hello Clint, Why do you need to remove all SSTables or drop the keyspace between tests? Isn't truncating tables enough to have clean and repeatable tests? Regards Duy Hai DOAN On Thu, May 1, 2014 at 5:54 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi, I am deleting

Re: Error evicting cold readers when launching an EmbeddedCassandraService for a second time

2014-05-01 Thread Clint Kelly
the SSTables between tests? I make extensive use of the same infrastructure as the EmbeddedCassandraService with Achilles and I have had no such issue so far. Regards On Wed, Apr 30, 2014 at 8:43 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi all, I have a unit test framework

Error evicting cold readers when launching an EmbeddedCassandraService for a second time

2014-04-30 Thread Clint Kelly
Hi all, I have a unit test framework for a Cassandra project that I'm working on. For every one of my test classes, I delete all of the data file, commit log, and saved cache locations, start an EmbeddedCassandraService, and populate a keyspace and tables from scratch. Currently, the unit tests

Per-keyspace partitioners?

2014-04-09 Thread Clint Kelly
Hi everyone, Is there a way to change the partitioner on a per-table or per-keyspace basis? We have some tables for which we'd like to enable ordered scans of rows, so we'd like to use the ByteOrdered partitioner for those, but use Murmur3 for everything else in our cluster. Is this possible?

Re: Meaning of token column in system.peers and system.local

2014-03-31 Thread Clint Kelly
the inconsistency you think you found is because the first and second queries went to different nodes. the java driver will connect to all nodes and load balance requests by default. T# On Mon, Mar 31, 2014 at 4:06 AM, Clint Kelly clint.ke...@gmail.com wrote: BTW one other thing that I have

Re: Cassandra Chef cookbook - weird bug with broadcast_address: 10.0.2.15

2014-03-31 Thread Clint Kelly
[:ipaddress] is equal to 10.0.2.15, hence your broadcast_address. You can set up networking in a different way or set the attribute node[:cassandra][:broadcast_address] manually. On Mon, Mar 31, 2014 at 3:03 AM, Clint Kelly clint.ke...@gmail.com wrote: All, Has anyone used the Cassandra Chef

Meaning of token column in system.peers and system.local

2014-03-30 Thread Clint Kelly
Hi all, I am working on a Hadoop InputFormat implementation that uses only the native protocol Java driver and not the Thrift API. I am currently trying to replicate some of the behavior of *Cassandra.client.describe_ring(myKeyspace)* from the Thrift API. I would like to do the following:

Cassandra Chef cookbook - weird bug with broadcast_address: 10.0.2.15

2014-03-30 Thread Clint Kelly
All, Has anyone used the Cassandra Chef cookbook https://github.com/michaelklishin/cassandra-chef-cookbook and seen broadcast_address: 10.0.2.15 in /etc/cassandra/cassandra.yaml? I looked through the source code for the cookbook and I have no idea how this is happening. I was able to fix this

Re: Meaning of token column in system.peers and system.local

2014-03-30 Thread Clint Kelly
, Mar 30, 2014 at 4:51 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi all, I am working on a Hadoop InputFormat implementation that uses only the native protocol Java driver and not the Thrift API. I am currently trying to replicate some of the behavior of *Cassandra.client.describe_ring

How to tear down an EmbeddedCassandraService in unit tests?

2014-03-28 Thread Clint Kelly
All, I have a question about how to use the EmbeddedCassandraService in unit tests. I wrote a short collection of unit tests here: https://github.com/wibiclint/cassandra-java-driver-keyspaces I'm trying to start up a new EmbeddedCassandraService for each unit test. I looked at the Cassandra

Building a maven project that depends on Cassandra 2.0.6 or 2.1 beta?

2014-03-03 Thread Clint Kelly
Folks, Can anyone instruct me about how to set up a maven project that depends on either 2.0.6 or 2.1? I am interested in using some of the new features (e.g., static columns) in my current project. Being able to just install one of these versions in my local maven repository would be good

Re: CQL: Any way to have inequalities on multiple clustering columns in a WHERE clause?

2014-02-28 Thread Clint Kelly
' ALLOW FILTERING On Fri, Feb 28, 2014 at 6:57 AM, Clint Kelly clint.ke...@gmail.com wrote: All, Is there any way to have inequality comparisons on multiple clustering columns in a WHERE clause in CQL? For example, I'd like to do: select * from foo where fam = 'Info' and qual 'A' and qual 'D

Re: Getting the most-recent version from time-series data

2014-02-28 Thread Clint Kelly
, this is done in parallel from the get-go. Fewer hops. Less load on the coordinator. No bottlenecks. And with a stored procedure, very very little additional overhead to the client, server, or network. -Tupshin On Tue, Feb 25, 2014 at 7:48 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi everyone

Re: Combine multiple SELECT statements into one RPC?

2014-02-28 Thread Clint Kelly
27, 2014 at 1:00 AM, Clint Kelly clint.ke...@gmail.com wrote: Hi all, Is there any way to use the DataStax Java driver to combine multiple SELECT statements into a single RPC? I assume not (I could not find anything about this in the documentation), but I just wanted to check. The short

Re: Getting the most-recent version from time-series data

2014-02-28 Thread Clint Kelly
to indicate (to our software that sits on top of C*) that they are going to use paging, and then we are going to be doing multiple client / server operations anyway. I'd just like to minimize them. :) Best regards, Clint On Fri, Feb 28, 2014 at 9:47 AM, Clint Kelly clint.ke...@gmail.com wrote: Hi

Any way to get a list of per-node token ranges using the DataStax Java driver?

2014-02-28 Thread Clint Kelly
Hi everyone, I've been working on a rewrite of the Cassandra InputFormat for Hadoop 2 using the DataStax Java driver instead of the Thrift API. I have a prototype working now, but there is one bit of code that I have not been able to replace with code for the Java driver. In the

Re: Resetting a counter in CQL

2014-02-28 Thread Clint Kelly
Great, thanks! On Fri, Feb 28, 2014 at 4:38 PM, Tyler Hobbs ty...@datastax.com wrote: On Fri, Feb 28, 2014 at 6:32 PM, Clint Kelly clint.ke...@gmail.com wrote: What is the best known method for resetting a counter in CQL? Is it best to read the counter and then increment it by a negative

Re: Naming variables in a prepared statement in the DataStax Java driver

2014-02-27 Thread Clint Kelly
Ah never mind, I see, currently you can refer to the ?'s by name by using the name of the column to which the ? refers. And this works as long as each column is present only once in the statement. Sorry for the extra list traffic! On Thu, Feb 27, 2014 at 7:33 PM, Clint Kelly clint.ke

CQL: Any way to have inequalities on multiple clustering columns in a WHERE clause?

2014-02-27 Thread Clint Kelly
All, Is there any way to have inequality comparisons on multiple clustering columns in a WHERE clause in CQL? For example, I'd like to do: select * from foo where fam = 'Info' and qual 'A' and qual 'D' and version 2013 ALLOW FILTERING; I get an error: Bad Request: PRIMARY KEY part

Re: Update multiple rows in a CQL lightweight transaction

2014-02-26 Thread Clint Kelly
:20 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi Tupshin, Thanks for your help! Unfortunately in my case, I will need to do a compare and set in which the compare is against a value in a dynamic column. In general, I need to be able to do the following: - Check whether a given value

Combine multiple SELECT statements into one RPC?

2014-02-26 Thread Clint Kelly
Hi all, Is there any way to use the DataStax Java driver to combine multiple SELECT statements into a single RPC? I assume not (I could not find anything about this in the documentation), but I just wanted to check. Thanks! Best regards, Clint

Re: Update multiple rows in a CQL lightweight transaction

2014-02-25 Thread Clint Kelly
column (coming with 2.0.6) as your conditional flag, as that column is shared by all rows in the partition. -Tupshin On Mon, Feb 24, 2014 at 3:57 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi Tupshin, Thanks for your help; I appreciate it. Could I do something like the following? Given

Re: Getting the most-recent version from time-series data

2014-02-25 Thread Clint Kelly
On Feb 25, 2014, at 7:49 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi everyone, Let's say that I have a table that looks like the following: CREATE TABLE time_series_stuff ( key text, family text, version int, val text, PRIMARY KEY (key, family, version

Re: Update multiple rows in a CQL lightweight transaction

2014-02-24 Thread Clint Kelly
The Resolution status of the JIRA is set to Later, so the implementation is probably not done yet. The JIRA was opened to discuss implementation strategy, but I guess nothing has been coded so far. On Sat, Feb 22, 2014 at 12:02 AM, Clint Kelly clint.ke...@gmail.com wrote: Folks, Does anyone know how I can

Update multiple rows in a CQL lightweight transaction

2014-02-21 Thread Clint Kelly
Folks, Does anyone know how I can modify multiple rows at once in a lightweight transaction in CQL3? I saw the following ticket: https://issues.apache.org/jira/browse/CASSANDRA-5633 but it was not obvious to me from the comments how (or whether) this got resolved. I also couldn't find

Buffering for lots of INSERT or UPDATE calls with DataStax Java driver?

2014-02-08 Thread Clint Kelly
Folks, Is there a recommended way to perform lots of INSERT operations in a row when using the DataStax Java driver? I notice that the RecordWriter for the CQL3 Hadoop implementation in Cassandra does some per-data-node buffering of CQL3 queries. The DataStax Java driver, on the other hand,

Re: Buffering for lots of INSERT or UPDATE calls with DataStax Java driver?

2014-02-08 Thread Clint Kelly
Java driver? -- Yes, use UNLOGGED batches. More info here: http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_reference/batch_r.html On Sat, Feb 8, 2014 at 10:19 PM, Clint Kelly clint.ke...@gmail.com wrote: Folks, Is there a recommended way to perform lots of INSERT
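For reference, the statement shape that the reply points to looks like the output of the sketch below. The helper `unlogged_batch` is a hypothetical name used only to show the CQL text; the table `foo` and its columns are placeholders, and real code would normally bind values through the driver's prepared statements rather than build strings.

```python
def unlogged_batch(statements):
    """Wrap individual CQL statements in BEGIN UNLOGGED BATCH ... APPLY BATCH."""
    if not statements:
        raise ValueError("a batch needs at least one statement")
    # Normalise each statement to end with exactly one semicolon.
    body = "\n".join("  " + s.strip().rstrip(";") + ";" for s in statements)
    return "BEGIN UNLOGGED BATCH\n" + body + "\nAPPLY BATCH;"


example = unlogged_batch([
    "INSERT INTO foo (key, val) VALUES ('a', '1')",
    "INSERT INTO foo (key, val) VALUES ('b', '2');",
])
```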

CQL3 delete using < or > ?

2014-02-08 Thread Clint Kelly
Folks, Is there any way to perform a delete in CQL of all rows where a particular column (that is part of the primary key) is less than a certain value? I believe that the corresponding SELECT statement works, as in this example: cqlsh:fiddle> describe table foo; CREATE TABLE foo ( key text,

Re: Cassandra 2.0 with Hadoop 2.x?

2014-02-06 Thread Clint Kelly
, at 19:10, Clint Kelly clint.ke...@gmail.com wrote: Folks, Has anyone out there used Cassandra 2.0 with Hadoop 2.x? I saw this discussion on the Cassandra JIRA: https://issues.apache.org/jira/browse/CASSANDRA-5201 but the fix referenced (https://github.com/michaelsembwever

Cassandra 2.0 with Hadoop 2.x?

2014-02-03 Thread Clint Kelly
Folks, Has anyone out there used Cassandra 2.0 with Hadoop 2.x? I saw this discussion on the Cassandra JIRA: https://issues.apache.org/jira/browse/CASSANDRA-5201 but the fix referenced (https://github.com/michaelsembwever/cassandra-hadoop) is for Cassandra 1.2. I put together a similar