RE: Supported Cassandra version for CentOS 5.5

2014-02-26 Thread Arindam Barua

I am running Cassandra 1.2.12 on CentOS 5.10.
Was running 1.1.15 previously without any issues as well.

-Arindam

From: Donald Smith [mailto:donald.sm...@audiencescience.com]
Sent: Tuesday, February 25, 2014 3:40 PM
To: user@cassandra.apache.org
Subject: RE: Supported Cassandra version for CentOS 5.5


I was unable to get Cassandra working with CentOS 5.X. I needed to use CentOS 
6.2 or 6.4.



Don


From: Hari Rajendhran hari.rajendh...@tcs.com
Sent: Tuesday, February 25, 2014 2:34 AM
To: user@cassandra.apache.org
Subject: Supported Cassandra version for CentOS 5.5

Hi,

Currently I am using the CentOS 5.5 OS. I need clarification on the latest 
Cassandra version (preferably 2.0.4) that my OS supports.



Best Regards
Hari Krishnan Rajendhran
Hadoop Admin
DESS-ABIM ,Chennai BIGDATA Galaxy
Tata Consultancy Services
Cell:- 9677985515
Mailto: hari.rajendh...@tcs.com
Website: http://www.tcs.com

Experience certainty. IT Services
   Business Solutions
   Consulting




RE: Bootstrap stuck: vnode enabled 1.2.12

2014-02-26 Thread Arindam Barua

As an update - finally got the node to join the ring.

Restarting all the nodes in the cluster, followed by a clean bootstrap of the 
node that was stuck did the trick.

-Arindam

From: Arindam Barua [mailto:aba...@247-inc.com]
Sent: Monday, February 24, 2014 5:04 PM
To: user@cassandra.apache.org
Subject: RE: Bootstrap stuck: vnode enabled 1.2.12


The host would not join the ring after more clean bootstrap attempts.

Noticed that nodetool netstats, even though it doesn't report any streaming, does 
constantly report Nothing streaming from 3 specific hosts in the ring.

$ nodetool netstats
xss =  -ea -d64 -javaagent:/usr/local/cassandra/bin/../lib/jamm-0.2.5.jar 
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8043M -Xmx8043M 
-Xmn800M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
Mode: JOINING
Not sending any streams.
Nothing streaming from /10.67.XXX.XXX
Nothing streaming from /10.67.XXX.XXX
Nothing streaming from /10.67.XXX.XXX

Today when I had to do some unrelated maintenance and attempted to drain the 
hosts mentioned above before restarting cassandra, the drain would just hang. 
Other hosts in the ring did not have any issue.
Also the original host that is stuck in the joining state, logged the following:

[24/02/2014:15:49:42 PST] GossipTasks:1: ERROR AbstractStreamSession.java (line 
110) Stream failed because /10.67.XXX.XXX died or was restarted/removed 
(streams may still be active in background, but further streams won't be 
started)
[24/02/2014:15:49:42 PST] GossipTasks:1:  WARN RangeStreamer.java (line 246) 
Streaming from /10.67.XXX.XXX failed


From: Arindam Barua [mailto:aba...@247-inc.com]
Sent: Tuesday, February 18, 2014 5:16 PM
To: user@cassandra.apache.org
Subject: RE: Bootstrap stuck: vnode enabled 1.2.12


I believe you are talking about CASSANDRA-6685, which was introduced in 1.2.15.

I'm trying to add a node to a production ring. I have added nodes previously 
just fine. However, this node had hardware issues during a previous bootstrap, 
and now even a clean bootstrap seems to be having problems. Does the ring 
somehow remember about this node and if so can I make it forget about it? 
Decommission/removenode does not work on a node that hasn't yet bootstrapped.

From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
Sent: Tuesday, February 18, 2014 12:30 PM
To: user@cassandra.apache.org
Subject: Re: Bootstrap stuck: vnode enabled 1.2.12

There is a bug where a node without schema can not bootstrap. Do you have 
schema?

On Tue, Feb 18, 2014 at 1:29 PM, Arindam Barua 
aba...@247-inc.com wrote:

The node is still out of the ring. Any suggestions on how to get it in will be 
very helpful.

From: Arindam Barua [mailto:aba...@247-inc.com]
Sent: Friday, February 14, 2014 1:04 AM
To: user@cassandra.apache.org
Subject: Bootstrap stuck: vnode enabled 1.2.12


After our otherwise successful upgrade procedure to enable vnodes, when adding 
back new hosts to our cluster, one non-seed host ran into a hardware issue 
during bootstrap. By the time the hardware issue was fixed a week later, all 
other nodes were added successfully, cleaned, repaired. The disks on this node 
were untouched, and when the node was started back up, it detected an 
interrupted bootstrap, and attempted to bootstrap. However, after ~24 hrs it 
was still stuck in the 'JOINING' state according to nodetool netstats on that 
node, even though no streams were flowing to/from it. Also, it did not appear 
in nodetool status in any way/form (not even as JOINING).

From couple of observed thread dumps, the stack of the thread blocked during 
bootstrap is at [1].

Since the node wasn't making any progress, I ended up stopping Cassandra, 
cleaning up the data and commitlog directories, and attempted a fresh 
bootstrap. Nodetool netstats immediately reported a whole bunch of streams 
queued up, and data started streaming to the node. The data directory quickly 
grew to 18 GB (the other nodes had ~25 GB, but we have a lot of data with low 
TTLs). However, the node ended up in the earlier reported state, i.e. 
nodetool netstats doesn't have anything queued, but still reports the JOINING 
state, even though it's been over 24 hrs. There are no other ERRORS in the logs, 
and new data being written to the cluster makes it to this node just fine, 
triggering compactions, etc from time to time.

Any help is appreciated.

Thanks,
Arindam
[1] Thread dump
Thread 3708: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=156 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=811 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(int)

FW: Sporadic gossip exception on add node

2014-02-26 Thread Desimpel, Ignace
That did not help, in the end.

So I enabled logging at debug level.
The log files tell me that the node being added is communicating with the other 
nodes (that are seed nodes). Still nothing seems to be returning to that node.
The log files on the other nodes are detecting the shadow request, but no other 
information like being unable to send something back.
And as before, just restarting that node once more does the trick and bootstrap 
is proceeding.

Maybe the problem has something to do with the gossip state?
My test case is: decommission node 10.164.8.93, restart a clean node 
10.164.8.93 and let it bootstrap.

In my test case, I see that 7 minutes before the node is added again to the 
ring the other nodes are detecting the decommission of the node.

2014-02-26 10:23:21.443 6 elapsed, /10.164.8.93 gossip quarantine over
2014-02-26 10:23:21.444 Ignoring state change for dead or unknown endpoint: 
/10.164.8.93
2014-02-26 10:23:55.636 Forcing conviction of /10.164.8.93
2014-02-26 10:24:00.230 Reseting version for /10.164.8.93
2014-02-26 10:24:00.230 Reseting version for /10.164.8.93


At the time the node 10.164.8.93 is added the log shows :


2014-02-26 10:30:03.015 Cassandra version: 2.0.5-SNAPSHOT
2014-02-26 10:30:03.016 Thrift API version: 19.39.0
2014-02-26 10:30:03.018 CQL supported versions: 2.0.0,3.1.4 (default: 3.1.4)
2014-02-26 10:30:03.029 Loading persisted ring state
2014-02-26 10:30:03.034 Starting shadow gossip round to check for endpoint 
collision
2014-02-26 10:30:03.034 Starting Messaging Service on port 9804
2014-02-26 10:30:03.046 attempting to connect to /10.164.8.249
2014-02-26 10:30:03.047 attempting to connect to /10.164.8.250
2014-02-26 10:30:03.048 attempting to connect to /10.164.8.92
2014-02-26 10:30:03.051 Handshaking version with /10.164.8.250
2014-02-26 10:30:03.052 Handshaking version with /10.164.8.249
2014-02-26 10:30:03.052 Handshaking version with /10.164.8.92
2014-02-26 10:30:34.059 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at 
org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1173) 
~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:424)
 ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:615)
 ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:583) 
~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:482) 
~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348) 
~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:465) 
~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.startUpCassandra(CassandraStore.java:495)
 [landc-services-search-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT-92200]
at 
be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.run(CassandraStore.java:461)
 [landc-services-search-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT-92200]
2014-02-26 10:30:34.069 Exception in thread 
Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException: null
at org.apache.cassandra.gms.Gossiper.stop(Gossiper.java:1250) 
~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:550)
 ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_40]
2014-02-26 10:31:34.081 ShutDownHook requests shutdown on 
be.landc.framework.service.ipl.Boot@730e8516


And at that time the 3 other nodes print log information:
-

2014-02-26 10:30:03.051 Connection version 7 from /10.164.8.93
2014-02-26 10:30:03.065 Upgrading incoming connection to be compressed
2014-02-26 10:30:03.130 Max version for /10.164.8.93 is 7
2014-02-26 10:30:03.130 Setting version 7 for /10.164.8.93
2014-02-26 10:30:03.131 set version for /10.164.8.93 to 7
2014-02-26 10:30:03.131 Shadow request received, adding all states


Any more information I can pass?
Regards,
Ignace

From: Desimpel, Ignace 

Cassandra nodetool status result after restoring snapshot

2014-02-26 Thread Ranking Lekarzy
Hi

I have two separate clusters, each consisting of 4 nodes. One cluster is running on
1.2.12 and the other one on 2.0.5. I loaded data from the first cluster
(1.2.12) to the second one (2.0.5) by copying snapshots between
corresponding nodes. I removed commitlogs, started the second cluster and ran
nodetool upgradesstables.
After this I expected that nodetool status would give me the same results in
the Load column on both clusters. Unfortunately it is completely different:
- old cluster: [728.02 GB, 558.24 GB, 787.08 GB, 555.1 GB]
- new cluster: [14.63 GB, 35.98 GB, 18 GB, 38.39 GB]

When I briefly check the data on the new cluster it looks fine. But I'm worried about
this difference. Do you have any idea what it means?

Thanks,
Michu


Re: CQL decimal encoding

2014-02-26 Thread Peter Lin

You may need to bit shift if that is the case

Sent from my iPhone

 On Feb 26, 2014, at 2:53 AM, Ben Hood 0x6e6...@gmail.com wrote:
 
 Hey Colin,
 
 On Tue, Feb 25, 2014 at 10:26 PM, Colin Blower cblo...@barracuda.com wrote:
 It looks like you are trying to implement the Decimal type. You might want
 to start with implementing the Integer type. The Decimal type follows pretty
 easily from the Integer type.
 
 For example:
 i = unmarshalInteger(data[4:])   // unscaled value
 s = decInt(data[0:4])            // scale
 out = inf.NewDec(i, s)
 
 Thanks for the suggestion.
 
 This is pretty much what I've got already. I think the issue might be
 to do with the way that big.Int doesn't appear to use two's complement
 to encode the varint. Maybe what is happening is that the encoding is
 isomorphic across say Java, .NET, Python and Ruby, but that the
 big.Int library in Go is not encoding in the same way.
 
 Cheers,
 
 Ben


Re: CQL decimal encoding

2014-02-26 Thread Laing, Michael
go uses 'zig-zag' encoding, perhaps that is the difference?
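For reference, a minimal sketch of zig-zag encoding in Go (the scheme used by
encoding/binary's signed varints), assuming nothing beyond the standard library:

package main

import "fmt"

// zigzag interleaves signed values into unsigned ones: 0,-1,1,-2,2 -> 0,1,2,3,4,
// so small magnitudes encode into few varint bytes.
func zigzag(n int64) uint64 { return uint64((n << 1) ^ (n >> 63)) }

// unzigzag inverts the mapping.
func unzigzag(u uint64) int64 { return int64(u>>1) ^ -int64(u&1) }

func main() {
	for _, n := range []int64{0, -1, 1, -2, 2} {
		fmt.Println(n, "->", zigzag(n), "->", unzigzag(zigzag(n)))
	}
}

As the rest of this thread works out, though, the CQL decimal encoding turns out
to use two's complement rather than zig-zag.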


On Wed, Feb 26, 2014 at 6:52 AM, Peter Lin wool...@gmail.com wrote:


 You may need to bit shift if that is the case

 Sent from my iPhone

  On Feb 26, 2014, at 2:53 AM, Ben Hood 0x6e6...@gmail.com wrote:
 
  Hey Colin,
 
  On Tue, Feb 25, 2014 at 10:26 PM, Colin Blower cblo...@barracuda.com
 wrote:
  It looks like you are trying to implement the Decimal type. You might
 want
  to start with implementing the Integer type. The Decimal type follows
 pretty
  easily from the Integer type.
 
  For example:
  i = unmarshalInteger(data[4:])   // unscaled value
  s = decInt(data[0:4])            // scale
  out = inf.NewDec(i, s)
 
  Thanks for the suggestion.
 
  This is pretty much what I've got already. I think the issue might be
  to do with the way that big.Int doesn't appear to use two's complement
  to encode the varint. Maybe what is happening is that the encoding is
  isomorphic across say Java, .NET, Python and Ruby, but that the
  big.Int library in Go is not encoding in the same way.
 
  Cheers,
 
  Ben



Re: Getting the most-recent version from time-series data

2014-02-26 Thread Tupshin Harper
And one last clarification. Where I said stored procedure earlier, I
meant prepared statement. Sorry for the confusion. Too much typing while
tired.

-Tupshin


On Tue, Feb 25, 2014 at 10:36 PM, Tupshin Harper tups...@tupshin.comwrote:

 I failed to address the matter of not knowing the families in advance.

 I can't really recommend any solution to that other than storing the list
 of families in another structure that is readily queryable. I don't know
 how many families you are thinking, but if it is in the millions or more,
 You might consider constructing another table such as:
 CREATE TABLE families (
   key int,
   family text,
   PRIMARY KEY (key, family)
 );


 store your families there, with a knowable set of keys (I suggest
 something like the last 3 digits of the md5 hash of the family). So then
 you could retrieve your families in nice sized batches
 SELECT family FROM families WHERE key=0;
 and then do the fan-out selects that I described previously.

 -Tupshin
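
A minimal sketch of that bucketing in Go, taking the last 3 hex digits of the
md5 as suggested (giving 4096 possible keys); the function name is illustrative:

package main

import (
	"crypto/md5"
	"fmt"
	"strconv"
)

// familyBucket derives a small, knowable key from a family name so the
// families table can be read back with a bounded set of queries.
func familyBucket(family string) int {
	sum := md5.Sum([]byte(family))
	hexDigest := fmt.Sprintf("%x", sum)
	n, _ := strconv.ParseInt(hexDigest[len(hexDigest)-3:], 16, 32)
	return int(n) // 0..4095
}

func main() {
	fmt.Println(familyBucket("revenue"), familyBucket("traffic"))
}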


 On Tue, Feb 25, 2014 at 10:15 PM, Tupshin Harper tups...@tupshin.comwrote:

 Hi Clint,

 What you are describing could actually be accomplished with the Thrift
 API and a multiget_slice with a slicerange having a count of 1. Initially I
 was thinking that this was an important feature gap between Thrift and CQL,
 and was going to suggest that it should be implemented (possible syntax is
 in https://issues.apache.org/jira/browse/CASSANDRA-6167 which is almost
 a superset of this feature).

 But then I was convinced by some colleagues, that with a modern CQL
 driver that is token aware, you are actually better off (in terms of
 latency, throughput, and reliability), by doing each query separately on
 the client.

 The reasoning is that if you did this with a single query, it would
 necessarily be sent to a coordinator that wouldn't own most of the data
 that you are looking for. That coordinator would then need to fan out the
 read to all the nodes owning the partitions you are looking for.

 Far better to just do it directly on the client. The token aware client
 will send each request for a row straight to a node that owns it. With a
 separate connection open to each node, this is done in parallel from the
 get-go. Fewer hops. Less load on the coordinator. No bottlenecks. And with
 a stored procedure, very very little additional overhead to the client,
 server, or network.

 -Tupshin
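
A minimal sketch of that client-side fan-out, assuming the gocql driver and the
time_series_stuff table from later in this thread (N=1 via LIMIT 1; error
handling trimmed):

package main

import (
	"fmt"
	"sync"

	"github.com/gocql/gocql"
)

// latestPerFamily issues one query per family in parallel; a token-aware
// driver can route each query straight to a replica owning that partition.
func latestPerFamily(session *gocql.Session, key string, families []string) {
	var wg sync.WaitGroup
	for _, f := range families {
		wg.Add(1)
		go func(family string) {
			defer wg.Done()
			var version int
			var val string
			// CLUSTERING ORDER BY (family ASC, version DESC) makes LIMIT 1
			// return the most recent version for this family.
			err := session.Query(
				`SELECT version, val FROM time_series_stuff
				 WHERE key = ? AND family = ? LIMIT 1`,
				key, family).Scan(&version, &val)
			if err == nil {
				fmt.Printf("%s/%s: version %d = %s\n", key, family, version, val)
			}
		}(f)
	}
	wg.Wait()
}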


 On Tue, Feb 25, 2014 at 7:48 PM, Clint Kelly clint.ke...@gmail.comwrote:

 Hi everyone,

 Let's say that I have a table that looks like the following:

 CREATE TABLE time_series_stuff (
   key text,
   family text,
   version int,
   val text,
   PRIMARY KEY (key, family, version)
 ) WITH CLUSTERING ORDER BY (family ASC, version DESC) AND
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='99.0PERCENTILE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};

 cqlsh:fiddle> select * from time_series_stuff;

  key| family  | version | val
 +-+-+
  monday | revenue |   3 | $$
  monday | revenue |   2 |$$$
  monday | revenue |   1 | $$
  monday | revenue |   0 |  $
  monday | traffic |   2 | medium
  monday | traffic |   1 |  light
  monday | traffic |   0 |  heavy

 (7 rows)

 Now let's say that I'd like to perform a query that gets me the most
 recent N versions of revenue and traffic.

 Is there a CQL query to do this?  Let's say that N=1.  Then I know that
 I can do:

 cqlsh:fiddle> select * from time_series_stuff where key='monday' and
 family='revenue' limit 1;

  key| family  | version | val
 +-+-+
  monday | revenue |   3 | $$

 (1 rows)

 cqlsh:fiddle> select * from time_series_stuff where key='monday' and
 family='traffic' limit 1;

  key| family  | version | val
 +-+-+
  monday | traffic |   2 | medium

 (1 rows)

 But what if I have lots of families and I want to get the most recent
 N versions of all of them in a single CQL statement.  Is that possible?
 Unfortunately I am working on something where the family names and the
 number of most-recent versions are not known a priori (I am porting some
 code that was designed for HBase).

 Best regards,
 Clint






Cassandra Java Client

2014-02-26 Thread Timmy Turner
Hi,

is the DataStax Java Driver for Apache Cassandra (
https://github.com/datastax/java-driver) the official/recommended Java
Client to use for accessing Cassandra from Java?

Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any
CQL clients?


Thanks!


Re: Cassandra Java Client

2014-02-26 Thread DuyHai Doan
Short answer : yes

Long answer: depending on whether you want to access Cassandra using
Thrift or the native CQL3 protocol, different options are available. For
Thrift access, lots of choices (Hector, Astyanax...). For CQL3, right now
the only Java client so far is the one provided by Datastax

Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any
CQL clients?

 No, the apache jars only ship the server-related components.




On Wed, Feb 26, 2014 at 2:00 PM, Timmy Turner timm.t...@gmail.com wrote:

 Hi,

 is the DataStax Java Driver for Apache Cassandra (
 https://github.com/datastax/java-driver) the official/recommended Java
 Client to use for accessing Cassandra from Java?

 Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any
 CQL clients?


 Thanks!



Re: Naive question about orphan rows

2014-02-26 Thread Edward Capriolo
It is probably ok to have redundant songs in playlists; cassandra is about
denormalization.

Dealing with this issue is going to be hard, since the only way to deal with
this would be scanning through the first cf and producing counts, then using
that information to delete in the second table. However that information
can change rapidly and then will fall out of sync fast.

The only ways to handle this are

1) never delete songs
2) store copies of songs in playlists

On Friday, February 21, 2014, Green, John M (HP Education) 
john.gr...@hp.com wrote:
 I'm very much a newbie so this may be a silly question but ...



 I have a situation similar to the music service example (
http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html)
of songs and playlists.  However, in my case, the songs would be
considered orphans that should be deleted when no playlists refer to
them.  Relational databases have mechanisms to manage this relationship so
that a song could be deleted as soon as the last playlist referencing
it is deleted.While I do NOT need to manage this as an atomic
transaction, I'm wondering what is the best way to delete orphaned rows
(i.e., songs not referenced by any playlists)  using Cassandra.



 I guess an alternative approach would be to store songs directly in the
playlists but this could lead to many redundant copies of the same song
which is something I'm hoping to avoid.  In my case the playlists could
have thousands of entries and the songs might be blobs of 10s of
Mbytes. Maybe I'm just having a hard time abandoning my relational roots?



 John

-- 
Sorry this was sent from mobile. Will do less grammar and spell check than
usual.


Re: Cassandra nodetool status result after restoring snapshot

2014-02-26 Thread Jonathan Lacefield
Hello,

  That's a large variation between the old and new cluster.  Are you sure
you pulled over all the SSTables for your keyspaces?  Also, did you run a
repair after the data move?  Do you have a lot of tombstone data in the old
cluster that was removed during the migration process?  Are you using
Opscenter?

  A quick comparison of cfstats between clusters may help you analyze your
situation and help you pinpoint if you are missing any data for a
particular keyspace, etc as well.

Thanks,

Jonathan

Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487
http://www.linkedin.com/in/jlacefield


http://www.datastax.com/what-we-offer/products-services/training/virtual-training


On Wed, Feb 26, 2014 at 6:07 AM, Ranking Lekarzy 
rankinglekarzy@gmail.com wrote:

 Hi

 I have two separate clusters, each consisting of 4 nodes. One cluster is running
 on 1.2.12 and the other one on 2.0.5. I loaded data from the first cluster
 (1.2.12) to the second one (2.0.5) by copying snapshots between
 corresponding nodes. I removed commitlogs, started the second cluster and ran
 nodetool upgradesstables.
 After this I expected that nodetool status would give me the same results in
 the Load column on both clusters. Unfortunately it is completely different:
 - old cluster: [728.02 GB, 558.24 GB, 787.08 GB, 555.1 GB]
 - new cluster: [14.63 GB, 35.98 GB, 18 GB, 38.39 GB]

 When I briefly check the data on the new cluster it looks fine. But I'm worried
 about this difference. Do you have any idea what it means?

 Thanks,
 Michu



Re: Cassandra Java Client

2014-02-26 Thread Vivek Mishra
Kundera does support CQL3. Work on supporting the datastax java driver is
under development.

https://github.com/impetus-opensource/Kundera/issues/385

-Vivek


On Wed, Feb 26, 2014 at 6:34 PM, DuyHai Doan doanduy...@gmail.com wrote:

 Short answer : yes

 Long answer: depending on whether you want to access Cassandra using
 Thrift or the native CQL3 protocol, different options are available. For
 Thrift access, lots of choices (Hector, Astyanax...). For CQL3, right now
 the only Java client so far is the one provided by Datastax

 Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any
 CQL clients?

  No, the apache jars only ship the server-related components.




 On Wed, Feb 26, 2014 at 2:00 PM, Timmy Turner timm.t...@gmail.com wrote:

 Hi,

 is the DataStax Java Driver for Apache Cassandra (
 https://github.com/datastax/java-driver) the official/recommended Java
 Client to use for accessing Cassandra from Java?

 Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any
 CQL clients?


 Thanks!





Flushing after dropping a column family

2014-02-26 Thread Ben Hood
Hi,

I'm trying to truncate data on a single node 2.0.5 instance and I'm
noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear to
leave the underlying data behind.

So I was wondering what nodetool operation I should use to completely
nuke the old data, short of dropping the entire keyspace.

Cheers,

Ben


Re: Flushing after dropping a column family

2014-02-26 Thread DuyHai Doan
I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear to
leave the underlying data behind.

 -- What do you mean by underlying data? Are you talking about
snapshots?

If yes, you can wipe them using the nodetool clearsnapshot command



On Wed, Feb 26, 2014 at 4:14 PM, Ben Hood 0x6e6...@gmail.com wrote:

 Hi,

 I'm trying to truncate data on a single node 2.0.5 instance and I'm
 noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear to
 leave the underlying data behind.

 So I was wondering what nodetool operation I should use to completely
 nuke the old data, short of dropping the entire keyspace.

 Cheers,

 Ben



RE: Naive question about orphan rows

2014-02-26 Thread Green, John M (HP Education)
Edward,

Thanks for your insight.

One other thought I had was to store a reference count with the song.  When 
the last playlist referencing the song is deleted the song will also be 
deleted because the reference count decrements to zero.   However, this would 
create some nastiness when it comes to reliably maintaining reference counts.   
I'm not sure if it would help to split the reference count into two 
monotonically increasing counters (number of references added, and number of 
references deleted).

In my case, users cannot browse a repository of songs to build a playlist 
from scratch.  They can only import songs themselves or create references to 
songs other users have explicitly made available to them.  Once a song is 
not referred to by any playlist it will never be re-discovered so it should 
be deleted.   This could be done in some sort of background data maintenance 
job that runs periodically.   Even if it is a low-priority background job, it 
looks like it will create a lot of overhead (scanning and producing counts).

John
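
For illustration, a sketch of the split-counter bookkeeping in Go with gocql;
the song_refs table and all names here are hypothetical:

package main

import "github.com/gocql/gocql"

// Hypothetical schema for the two monotonically increasing counters:
//   CREATE TABLE song_refs (
//     song_id uuid PRIMARY KEY,
//     refs_added counter,
//     refs_removed counter
//   );

func addRef(s *gocql.Session, song gocql.UUID) error {
	return s.Query(`UPDATE song_refs SET refs_added = refs_added + 1
	                WHERE song_id = ?`, song).Exec()
}

func removeRef(s *gocql.Session, song gocql.UUID) error {
	return s.Query(`UPDATE song_refs SET refs_removed = refs_removed + 1
	                WHERE song_id = ?`, song).Exec()
}

// orphaned reports whether a song appears to have no live references.
// Counter updates are not transactional, so treat this as a hint for the
// background cleanup job, not an atomic check.
func orphaned(s *gocql.Session, song gocql.UUID) (bool, error) {
	var added, removed int64
	err := s.Query(`SELECT refs_added, refs_removed FROM song_refs
	                WHERE song_id = ?`, song).Scan(&added, &removed)
	if err != nil {
		return false, err
	}
	return added-removed <= 0, nil
}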
From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
Sent: Wednesday, February 26, 2014 5:56 AM
To: user@cassandra.apache.org
Subject: Re: Naive question about orphan rows

It is probably ok to have redundant songs in playlists; cassandra is about
denormalization.

Dealing with this issue is going to be hard, since the only way to deal with
this would be scanning through the first cf and producing counts, then using that
information to delete in the second table. However that information can change
rapidly and then will fall out of sync fast.

The only ways to handle this are

1) never delete songs
2) store copies of songs in playlists

On Friday, February 21, 2014, Green, John M (HP Education) 
john.gr...@hp.com wrote:
 I'm very much a newbie so this may be a silly question but ...



 I have a situation similar to the music service example 
 (http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html)
  of songs and playlists.  However, in my case, the songs would be 
 considered orphans that should be deleted when no playlists refer to them.  
 Relational databases have mechanisms to manage this relationship so that a 
 song could be deleted as soon as the last playlist referencing it is 
 deleted.While I do NOT need to manage this as an atomic transaction, I'm 
 wondering what is the best way to delete orphaned rows (i.e., songs not 
 referenced by any playlists)  using Cassandra.



 I guess an alternative approach would be to store songs directly in the 
 playlists but this could lead to many redundant copies of the same song 
 which is something I'm hoping to avoid.  In my case the playlists could 
 have thousands of entries and the songs might be blobs of 10s of Mbytes.
 Maybe I'm just having a hard time abandoning my relational roots?



 John

--
Sorry this was sent from mobile. Will do less grammar and spell check than 
usual.


Re: Flushing after dropping a column family

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com wrote:
 I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear to
 leave the underlying data behind.

  -- What do you mean by underlying data ? Are you talking about
 snapshots ?

I was referring to all of the state related to the particular column
family I want to set fire to, be it snapshots, parts of commit logs,
sstables, key caches, row caches, or anything else on or off disk that
relates to said column family.

 If yes, you can wipe them using the nodetool clearsnapshot command

This is what I'm doing:

cqlsh:bar> drop table foo;

$ nodetool clearsnapshot bar
Requested clearing snapshot for: bar

cqlsh:bar> create table foo ();
cqlsh:bar> select * from foo limit 1;

This returns nothing (as you would expect).

But if I re-run this again after about a minute, the data is back.

I get the same behavior if I use nodetool cleanup, flush, compact or repair.

It's as if there is either a background app process filling the table
up again, or the deletion hasn't taken place.


Re: Flushing after dropping a column family

2014-02-26 Thread DuyHai Doan
Try truncate foo instead of drop table foo.

About the nodetool clearsnapshot, I've experienced the same behavior
before as well. Snapshot cleaning is not immediate


On Wed, Feb 26, 2014 at 4:53 PM, Ben Hood 0x6e6...@gmail.com wrote:

 On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com wrote:
  I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear
 to
  leave the underlying data behind.
 
   -- What do you mean by underlying data ? Are you talking about
  snapshots ?

 I was referring to all of the state related to the particular column
 family I want to set fire to, be it snapshots, parts of commit logs,
 sstables, key caches, row caches, or anything else on or off disk that
 relates to said column family.

  If yes, you can wipe them using the nodetool clearsnapshot command

 This is what I'm doing:

 cqlsh:bar> drop table foo;

 $ nodetool clearsnapshot bar
 Requested clearing snapshot for: bar

 cqlsh:bar> create table foo ();
 cqlsh:bar> select * from foo limit 1;

 This returns nothing (as you would expect).

 But if I re-run this again after about a minute, the data is back.

 I get the same behavior if I use nodetool cleanup, flush, compact or
 repair.

 It's as if there is either a background app process filling the table
 up again, or the deletion hasn't taken place.



Re: Flushing after dropping a column family

2014-02-26 Thread Tupshin Harper
This is a known issue that is fixed in 2.1beta1.
https://issues.apache.org/jira/browse/CASSANDRA-5202

Until 2.1, we do not recommend relying on the recycling of tables through
drop/create or truncate.

However, on a single node cluster, I suspect that truncate will work far
more reliably than drop/recreate.

-Tupshin


On Wed, Feb 26, 2014 at 10:53 AM, Ben Hood 0x6e6...@gmail.com wrote:

 On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com wrote:
  I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear
 to
  leave the underlying data behind.
 
   -- What do you mean by underlying data ? Are you talking about
  snapshots ?

 I was referring to all of the state related to the particular column
 family I want to set fire to, be it snapshots, parts of commit logs,
 sstables, key caches, row caches, or anything else on or off disk that
 relates to said column family.

  If yes, you can wipe them using the nodetool clearsnapshot command

 This is what I'm doing:

 cqlsh:bar> drop table foo;

 $ nodetool clearsnapshot bar
 Requested clearing snapshot for: bar

 cqlsh:bar> create table foo ();
 cqlsh:bar> select * from foo limit 1;

 This returns nothing (as you would expect).

 But if I re-run this again after about a minute, the data is back.

 I get the same behavior if I use nodetool cleanup, flush, compact or
 repair.

 It's as if there is either a background app process filling the table
 up again, or the deletion hasn't taken place.



Re: Flushing after dropping a column family

2014-02-26 Thread DuyHai Doan
I've played with it using a 2-node cluster with auto_snapshot = false in
cassandra.yaml and by deactivating durable writes (no commitlog). In my
case, truncating tables allows cleaning up data.

 With nodetool status, I can see the data payload decreasing from GB to
some kbytes


On Wed, Feb 26, 2014 at 4:59 PM, Tupshin Harper tups...@tupshin.com wrote:

 This is a known issue that is fixed in 2.1beta1.
 https://issues.apache.org/jira/browse/CASSANDRA-5202

 Until 2.1, we do not recommend relying on the recycling of tables through
 drop/create or truncate.

 However, on a single node cluster, I suspect that truncate will work far
 more reliably than drop/recreate.

 -Tupshin


 On Wed, Feb 26, 2014 at 10:53 AM, Ben Hood 0x6e6...@gmail.com wrote:

 On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com
 wrote:
  I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear
 to
  leave the underlying data behind.
 
   -- What do you mean by underlying data ? Are you talking about
  snapshots ?

 I was referring to all of the state related to the particular column
 family I want to set fire to, be it snapshots, parts of commit logs,
 sstables, key caches, row caches, or anything else on or off disk that
 relates to said column family.

  If yes, you can wipe them using the nodetool clearsnapshot command

 This is what I'm doing:

 cqlsh:bar> drop table foo;

 $ nodetool clearsnapshot bar
 Requested clearing snapshot for: bar

 cqlsh:bar> create table foo ();
 cqlsh:bar> select * from foo limit 1;

 This returns nothing (as you would expect).

 But if I re-run this again after about a minute, the data is back.

 I get the same behavior if I use nodetool cleanup, flush, compact or
 repair.

 It's as if there is either a background app process filling the table
 up again, or the deletion hasn't taken place.





Re: Naive question about orphan rows

2014-02-26 Thread Edward Capriolo
Right, the problem with building a list of counts in a batch is what happens
if a song is added as you are building the counts.


On Wed, Feb 26, 2014 at 10:32 AM, Green, John M (HP Education) 
john.gr...@hp.com wrote:

  Edward,


 Thanks for your insight.



 One other thought I had was to store a reference count with the song.
 When the last playlist referencing the song is deleted the song will
 also be deleted because the reference count decrements to zero.   However,
 this would create some nastiness when it comes to reliably maintaining
 reference counts.   I'm not sure if it would help to split the reference
 count into two monotonically increasing counters (number of references
 added, and number of references deleted).



 In my case, users cannot browse a repository of songs to build a
 playlist from scratch.  They can only import songs themselves or create
 references to songs other users have explicitly made available to them.
 Once a song is not referred to by any playlist it will never be
 re-discovered so it should be deleted.   This could be done in some sort of
 background data maintenance job that runs periodically.   Even if it is a
 low-priority background job, it looks like it will create a lot of overhead
 (scanning and producing counts).



 John

 *From:* Edward Capriolo [mailto:edlinuxg...@gmail.com]
 *Sent:* Wednesday, February 26, 2014 5:56 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: Naive question about orphan rows



 It is probably ok to have redundant songs in playlists; cassandra is about
 denormalization.

 Dealing with this issue is going to be hard, since the only way to deal
 with this would be scanning through the first cf and producing counts, then
 using that information to delete in the second table. However that
 information can change rapidly and then will fall out of sync fast.

 The only ways to handle this are

 1) never delete songs
 2) store copies of songs in playlists

 On Friday, February 21, 2014, Green, John M (HP Education) 
 john.gr...@hp.com wrote:
  I'm very much a newbie so this may be a silly question but ...
 
 
 
  I have a situation similar to the music service example (
 http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html)
 of songs and playlists.  However, in my case, the songs would be
 considered orphans that should be deleted when no playlists refer to
 them.  Relational databases have mechanisms to manage this relationship so
 that a song could be deleted as soon as the last playlist referencing
 it is deleted.While I do NOT need to manage this as an atomic
 transaction, I'm wondering what is the best way to delete orphaned rows
 (i.e., songs not referenced by any playlists)  using Cassandra.
 
 
 
  I guess an alternative approach would be to store songs directly in
 the playlists but this could lead to many redundant copies of the same
 song which is something I'm hoping to avoid.  In my case the playlists
 could have thousands of entries and the songs might be blobs of 10s of
 Mbytes. Maybe I'm just having a hard time abandoning my relational roots?
 
 
 
  John

 --
 Sorry this was sent from mobile. Will do less grammar and spell check than
 usual.



Re: Flushing after dropping a column family

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 3:58 PM, DuyHai Doan doanduy...@gmail.com wrote:
 Try truncate foo instead of drop table foo.

 About the nodetool clearsnapshot, I've experienced the same behavior also
 before. Snapshots cleaning is not immediate

I get the same behavior with truncate as well.


Re: Flushing after dropping a column family

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 3:59 PM, Tupshin Harper tups...@tupshin.com wrote:
 This is a known issue that is fixed in 2.1beta1.
 https://issues.apache.org/jira/browse/CASSANDRA-5202

 Until 2.1, we do not recommend relying on the recycling of tables through
 drop/create or truncate.

 However, on a single node cluster, I suspect that truncate will work far
 more reliably than drop/recreate.


Cool, thanks for the heads up.

Short of using 2.1beta1, is there a more stable way of recycling tables?


RE: Supported Cassandra version for CentOS 5.5

2014-02-26 Thread Donald Smith
Oh, I should add that I was trying to use Cassandra 2.0.X on CentOS and it 
needed CentOS 6.2+.

Don

From: Arindam Barua [mailto:aba...@247-inc.com]
Sent: Wednesday, February 26, 2014 1:52 AM
To: user@cassandra.apache.org
Subject: RE: Supported Cassandra version for CentOS 5.5


I am running Cassandra 1.2.12 on CentOS 5.10.
Was running 1.1.15 previously without any issues as well.

-Arindam

From: Donald Smith [mailto:donald.sm...@audiencescience.com]
Sent: Tuesday, February 25, 2014 3:40 PM
To: user@cassandra.apache.org
Subject: RE: Supported Cassandra version for CentOS 5.5


I was unable to get Cassandra working with CentOS 5.X. I needed to use CentOS 
6.2 or 6.4.



Don


From: Hari Rajendhran hari.rajendh...@tcs.com
Sent: Tuesday, February 25, 2014 2:34 AM
To: user@cassandra.apache.org
Subject: Supported Cassandra version for CentOS 5.5

Hi,

Currently I am using the CentOS 5.5 OS. I need clarification on the latest 
Cassandra version (preferably 2.0.4) that my OS supports.



Best Regards
Hari Krishnan Rajendhran
Hadoop Admin
DESS-ABIM ,Chennai BIGDATA Galaxy
Tata Consultancy Services
Cell:- 9677985515
Mailto: hari.rajendh...@tcs.com
Website: http://www.tcs.com

Experience certainty. IT Services
   Business Solutions
   Consulting




Re: Supported Cassandra version for CentOS 5.5

2014-02-26 Thread Tim Dunphy
I am running Cassandra 2.0.5 on CentOS 5.9 without issue. Getting
CassandraPDO to work with PHP... well, that's another matter entirely.
I haven't had any luck there at all. I may have to move to CentOS 6.x for
that reason alone!

Tim


On Wed, Feb 26, 2014 at 11:55 AM, Donald Smith 
donald.sm...@audiencescience.com wrote:

  Oh, I should add that I was trying to use Cassandra 2.0.X on CentOS and
 it needed CentOS 6.2+.



 Don



 *From:* Arindam Barua [mailto:aba...@247-inc.com]
 *Sent:* Wednesday, February 26, 2014 1:52 AM

 *To:* user@cassandra.apache.org
 *Subject:* RE: Supported Cassandra version for CentOS 5.5





 I am running Cassandra 1.2.12 on CentOS 5.10.

 Was running 1.1.15 previously without any issues as well.



 -Arindam



 *From:* Donald Smith [mailto:donald.sm...@audiencescience.com]

 *Sent:* Tuesday, February 25, 2014 3:40 PM
 *To:* user@cassandra.apache.org
 *Subject:* RE: Supported Cassandra version for CentOS 5.5



 I was unable to get Cassandra working with CentOS 5.X. I needed to use
 CentOS 6.2 or 6.4.



 Don
   --

 *From:* Hari Rajendhran hari.rajendh...@tcs.com
 *Sent:* Tuesday, February 25, 2014 2:34 AM
 *To:* user@cassandra.apache.org
 *Subject:* Supported Cassandra version for CentOS 5.5



 Hi,

 Currently I am using the CentOS 5.5 OS. I need clarification on the latest
 Cassandra version (preferably 2.0.4) that my OS supports.



 Best Regards
 Hari Krishnan Rajendhran
 Hadoop Admin
 DESS-ABIM ,Chennai BIGDATA Galaxy
 Tata Consultancy Services
 Cell:- 9677985515
 Mailto: hari.rajendh...@tcs.com
 Website: http://www.tcs.com
 
 Experience certainty. IT Services
Business Solutions
Consulting
 





-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Re: Flushing after dropping a column family

2014-02-26 Thread Robert Wille
I use truncate between my test cases. Never had a problem with one test
case inheriting the data from the previous one. I'm using a single node,
so that may be why.

On 2/26/14, 9:27 AM, Ben Hood 0x6e6...@gmail.com wrote:

On Wed, Feb 26, 2014 at 3:58 PM, DuyHai Doan doanduy...@gmail.com wrote:
 Try truncate foo instead of drop table foo.

 About the nodetool clearsnapshot, I've experienced the same behavior
 before as well. Snapshot cleaning is not immediate

I get the same behavior with truncate as well.




Commit logs building up

2014-02-26 Thread Christopher Wirt
We're running 2.0.5, recently upgraded from 1.2.14.

Sometimes we are seeing CommitLogs starting to build up.

Is this a potential bug? Or a symptom of something else we can easily
address?

We have

commitlog_sync: periodic
commitlog_sync_period_in_ms: 1
commitlog_segment_size_in_mb: 512

Thanks,

Chris



Re: Naive question about orphan rows

2014-02-26 Thread Edward Capriolo
One way to handle this is that both tables should be de-normalized. Take
this:

SongsAndPlaylists
PlaylistsAndSongs

In this way your client software is charged with keeping data in sync.

When you remove a song from PlaylistsAndSongs you do a read for that song
in SongsAndPlaylists. If the number of playlists with that song is now 0 the
song can be removed.
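
A sketch of that remove-then-check flow in Go with gocql, assuming hypothetical
column names and a separate songs table holding the blobs; as noted, the client
keeps the two tables in sync, and the read-then-delete is racy without extra
coordination:

package main

import "github.com/gocql/gocql"

// removeSongFromPlaylist deletes both directions of the edge, then checks
// whether any playlist still references the song before removing the song
// itself. A concurrent add can race with the final delete.
func removeSongFromPlaylist(s *gocql.Session, playlist, song string) error {
	if err := s.Query(`DELETE FROM PlaylistsAndSongs
	                   WHERE playlist = ? AND song = ?`, playlist, song).Exec(); err != nil {
		return err
	}
	if err := s.Query(`DELETE FROM SongsAndPlaylists
	                   WHERE song = ? AND playlist = ?`, song, playlist).Exec(); err != nil {
		return err
	}
	var count int64
	if err := s.Query(`SELECT COUNT(*) FROM SongsAndPlaylists
	                   WHERE song = ?`, song).Scan(&count); err != nil {
		return err
	}
	if count == 0 {
		// No playlist references remain, so the song blob can go too.
		return s.Query(`DELETE FROM songs WHERE song = ?`, song).Exec()
	}
	return nil
}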




On Wed, Feb 26, 2014 at 11:17 AM, Edward Capriolo edlinuxg...@gmail.comwrote:

 Right, the problem with building a list of counts in a batch is what
 happens if a song is added as you are building the counts.


 On Wed, Feb 26, 2014 at 10:32 AM, Green, John M (HP Education) 
 john.gr...@hp.com wrote:

  Edward,


 Thanks for your insight.



 One other thought I had was to store a reference count with the song.
 When the last playlist referencing the song is deleted the song will
 also be deleted because the reference count decrements to zero.   However,
 this would create some nastiness when it comes to reliably maintaining
 reference counts.   I'm not sure if it would help to split the reference
 count into two monotonically increasing counters (number of references
 added, and number of references deleted).



 In my case, users cannot browse a repository of songs to build a
 playlist from scratch.  They can only import songs themselves or create
 references to songs other users have explicitly made available to them.
 Once a song is not referred to by any playlist it will never be
 re-discovered so it should be deleted.   This could be done in some sort of
 background data maintenance job that runs periodically.   Even if it is a
 low-priority background job it look like it will create a lot overhead
 (scanning and producing counts).



 John

 *From:* Edward Capriolo [mailto:edlinuxg...@gmail.com]
 *Sent:* Wednesday, February 26, 2014 5:56 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: Naive question about orphan rows



 It is probably ok to have redundant songs in playlists; cassandra is
 about denormalization.

 Dealing with this issue is going to be hard, since the only way to deal
 with this would be scanning through the first cf and producing counts, then
 using that information to delete in the second table. However that
 information can change rapidly and then will fall out of sync fast.

 The only ways to handle this are

 1) never delete songs
 2) store copies of songs in playlists

 On Friday, February 21, 2014, Green, John M (HP Education) 
 john.gr...@hp.com wrote:
  I'm very much a newbie so this may be a silly question but ...
 
 
 
  I have a situation similar to the music service example (
 http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html)
 of songs and playlists.  However, in my case, the songs would be
 considered orphans that should be deleted when no playlists refer to
 them.  Relational databases have mechanisms to manage this relationship so
 that a song could be deleted as soon as the last playlist referencing
 it is deleted.While I do NOT need to manage this as an atomic
 transaction, I'm wondering what is the best way to delete orphaned rows
 (i.e., songs not referenced by any playlists)  using Cassandra.
 
 
 
  I guess an alternative approach would be to store songs directly in
 the playlists but this could lead to many redundant copies of the same
 song which is something I'm hoping to avoid.  In my case the playlists
 could have thousands of entries and the songs might be blobs of 10s of
 Mbytes. Maybe I'm just having a hard time abandoning my relational roots?
 
 
 
  John

 --
 Sorry this was sent from mobile. Will do less grammar and spell check
 than usual.





Re: Update multiple rows in a CQL lightweight transaction

2014-02-26 Thread Clint Kelly
Thanks for your help everyone.

Sylvain, as I understand it, the scenario I described above is not resolved
by CASSANDRA-6561, correct?

(This scenario may not matter to most folks, which is totally fine, I just
want to make sure that I understand.)

Should I instead look into using the Thrift API to address this?

Best regards,
Clint



On Tue, Feb 25, 2014 at 11:30 PM, Sylvain Lebresne sylv...@datastax.comwrote:

 Sorry to interfere again here, but CASSANDRA-5633 will not be picked up
 because pretty much everything it was set to fix is fixed by
 CASSANDRA-6561; this is *not* a syntax problem anymore.


 On Wed, Feb 26, 2014 at 3:18 AM, Tupshin Harper tups...@tupshin.comwrote:

 Unfortunately there is no option to vote for a resolved ticket, but if
 you can propose a better syntax that people agree on, you could probably
 get some fresh traction on it.

 -Tupshin
  On Feb 25, 2014 7:20 PM, Clint Kelly clint.ke...@gmail.com wrote:

 Hi Tupshin,

 Thanks for your help!  Unfortunately in my case, I will need to do a
 compare and set in which the compare is against a value in a dynamic column.

 In general, I need to be able to do the following:

- Check whether a given value exists in a dynamic column
- If so, perform some number of insertions / deletions for dynamic
columns in the same row (i.e., with the same partition key as the dynamic
column used for the compare)

 I think you are correct that I need
 https://issues.apache.org/jira/browse/CASSANDRA-5633 to be
 implemented.  Is there any way to vote for that to get picked up again?  :)

 Best regards,
 Clint





 On Mon, Feb 24, 2014 at 2:32 PM, Tupshin Harper tups...@tupshin.comwrote:

 Hi Clint,

 That does appear to be an omission in CQL3. It would be possible to
 simulate it by doing
 BEGIN BATCH
 UPDATE foo SET z = 10 WHERE x = 'a' AND y = 1 IF t= 2 AND z=10;
 UPDATE foo SET t = 5,z=6 where x = 'a' AND y = 4
 APPLY BATCH;

 However, this does a redundant write to the first row if the condition
 holds, and I certainly wouldn't recommend doing that routinely.

 Alternatively, depending on your needs, you might be able to use a
 static column (coming with 2.0.6) as your conditional flag, as that column
 is shared by all rows in the partition.

 -Tupshin



 On Mon, Feb 24, 2014 at 3:57 PM, Clint Kelly clint.ke...@gmail.comwrote:

 Hi Tupshin,

 Thanks for your help; I appreciate it.

 Could I do something like the following?

 Given the same table you started with:

 x | y | t | z
 ---+---+---+
  a | 1 | 2 | 10
  a | 2 | 2 | 20

 I'd like to write a compare-and-set that does something like:

 If there is a row with (x,y,t,z) = (a,1,2,10), then update/insert a
 row with (x,y,t,z) = (a,3,4,5) and update/insert a row with (x,y,t,z)
 = (a,4,5,6).


 I don't see how I could do this with what you outlined above---just
 curious.  It seems like what I describe above under the hood would be
 a compare-and-(batch)-set on a single wide row, so it maybe is
 possible with the Thrift API (I have to check).

 Thanks again!

 Best regards,
 Clint

 On Sat, Feb 22, 2014 at 11:38 AM, Tupshin Harper tups...@tupshin.com
 wrote:
  #5633 was actually closed  because the static columns feature
  (https://issues.apache.org/jira/browse/CASSANDRA-6561) which has
 been
  checked in to the 2.0 branch but is not yet part of a release (it
 will be in
  2.0.6).
 
  That feature will let you update multiple rows within a single
 partition by
  doing a CAS write based on a static column shared by all rows within
 the
  partition.
 
  Example extracted from the ticket:
  CREATE TABLE foo (
  x text,
  y bigint,
  t bigint static,
  z bigint,
  PRIMARY KEY (x, y) );
 
  insert into foo (x,y,t, z) values ('a', 1, 1, 10);
  insert into foo (x,y,t, z) values ('a', 2, 2, 20);
 
  select * from foo;
 
  x | y | t | z
  ---+---+---+
   a | 1 | 2 | 10
   a | 2 | 2 | 20
  (Note that both values of t are 2 because it is static)
 
 
   begin batch update foo set z = 1 where x = 'a' and y = 1; update
 foo set z
  = 2 where x = 'a' and y = 2 if t = 4; apply batch;
 
   [applied] | x | y    | t
  -----------+---+------+---
       False | a | null | 2
 
  (Both updates failed to apply because there was an unmet conditional
 on one
  of them)
 
  select * from foo;
 
   x | y | t | z
  ---+---+---+
   a | 1 | 2 | 10
   a | 2 | 2 | 20
 
 
  begin batch update foo set z = 1 where x = 'a' and y = 1; update foo
 set z =
  2 where x = 'a' and y = 2 if t = 2; apply batch;
 
   [applied]
  ---
True
 
  (both updates succeeded because the check on t succeeded)
 
  select * from foo;
  x | y | t | z
  ---+---+---+---
   a | 1 | 2 | 1
   a | 2 | 2 | 2
 
  Hope this helps.
 
  -Tupshin
 
 
 
  On Fri, Feb 21, 2014 at 6:05 PM, DuyHai Doan doanduy...@gmail.com
 wrote:
 
  Hello Clint
 
  The Resolution status of the JIRA is set to Later; probably the
  implementation is not done yet. The JIRA was opened to discuss impl
  strategy but 

Combine multiple SELECT statements into one RPC?

2014-02-26 Thread Clint Kelly
Hi all,

Is there any way to use the DataStax Java driver to combine multiple SELECT
statements into a single RPC?  I assume not (I could not find anything
about this in the documentation), but I just wanted to check.

Thanks!

Best regards,
Clint


Re: CQL decimal encoding

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 12:10 PM, Laing, Michael
michael.la...@nytimes.com wrote:
 go uses 'zig-zag' encoding, perhaps that is the difference?


 On Wed, Feb 26, 2014 at 6:52 AM, Peter Lin wool...@gmail.com wrote:


 You may need to bit shift if that is the case

Thanks for everybody's help, I've managed to solve the issue: the
unscaled part of the decimal needs to be encoded using two's
complement. Neither the standard Go big.Rat type nor the more amenable
replacement inf.Dec uses two's complement encoding, which is what Java's
BigDecimal and the other languages are doing.

Ironically, the code to do the two's complement packing and unpacking
is available in the asn1 module of the standard Go library.
Unfortunately the functions are not exported outside the package
scope, since they are designed for internal use only. So open source to
the rescue.

Hopefully the gocql team can code review this soon and if that's good
to go, we'll have another CQL driver that can deal with decimals.
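
To make that concrete, a minimal sketch of the encoding side, assuming the wire
format described here (a 4-byte big-endian scale followed by the unscaled value
as a big-endian two's-complement integer):

package main

import (
	"encoding/binary"
	"fmt"
	"math/big"
)

// encodeDecimal serializes a decimal as a 4-byte big-endian scale followed
// by the unscaled value in big-endian two's complement.
func encodeDecimal(unscaled *big.Int, scale int32) []byte {
	out := make([]byte, 4, 4+unscaled.BitLen()/8+1)
	binary.BigEndian.PutUint32(out, uint32(scale))
	return append(out, twosComplement(unscaled)...)
}

// twosComplement converts a big.Int (stored internally as sign+magnitude)
// into a big-endian two's-complement byte slice.
func twosComplement(n *big.Int) []byte {
	switch n.Sign() {
	case 0:
		return []byte{0}
	case 1:
		b := n.Bytes()
		if b[0]&0x80 != 0 {
			b = append([]byte{0}, b...) // keep the sign bit clear for positives
		}
		return b
	default:
		// For negative n, pick a byte width that fits the value plus a sign
		// bit, and add 2^(8*size): the sum's bytes are the two's complement.
		size := n.BitLen()/8 + 1
		t := new(big.Int).Lsh(big.NewInt(1), uint(8*size))
		return t.Add(t, n).Bytes()
	}
}

func main() {
	// The decimal -0.01 is unscaled -1 with scale 2: 00 00 00 02 ff.
	fmt.Printf("% x\n", encodeDecimal(big.NewInt(-1), 2))
}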


Re: CQL decimal encoding

2014-02-26 Thread Ben Hood
On Thu, Feb 27, 2014 at 12:01 AM, Ben Hood 0x6e6...@gmail.com wrote:
 Hopefully the gocql team can code review this soon and if that's good
 to go, we'll have another CQL driver that can deal with decimals.

BTW thanks and kudos go to Theo and Tyler (of the cql-rb and the
datastax python drivers respectively) for publishing encoding test
cases for the decimal type - that was quite helpful :-)


Re: CQL decimal encoding

2014-02-26 Thread Ben Hood
On Thu, Feb 27, 2014 at 12:05 AM, Ben Hood 0x6e6...@gmail.com wrote:
 BTW thanks and kudos go to Theo and Tyler (of the cql-rb and the
 datastax python drivers respectively) for publishing encoding test
 cases for the decimal type - that was quite helpful :-)

Sorry, I forgot to mention the inspiration gained from Paul's Perl
encoding specs as well.

And just generally thanks to everybody for chiming in :-)


Background flushing appears to peg CPU

2014-02-26 Thread Ben Hood
Hi,

Using Cassandra 2.0.5 we seem to be running into an issue with a
continuous flush of a column family that has no current data ingress.
After disconnecting all clients from the node, the Cassandra instance
seems to be continuously flushing a specific column family, with this line
appearing all over the logs:

INFO [OptionalTasks:1] 2014-02-27 07:36:39,366 MeteredFlusher.java
(line 63) - flushing high-traffic column family CFS(Keyspace='bar',
ColumnFamily='foo') (estimated 70078271 bytes)

Restarting the node didn't appear to change the situation.

Does anybody know why this might be happening for a column family that
appears not to be receiving any writes?

Cheers,

Ben