Gradle script to execute cql3 scripts

2013-09-03 Thread dawood abdullah
I have a requirement to execute CQL3 scripts through Gradle. Is there a Cassandra plugin for Gradle that does this, or is there another way to execute CQL3 scripts during the build itself? Please suggest. Dawood

read ?

2013-09-03 Thread Langston, Jim
Hi all, Quick question: I am currently looking at a 4-node cluster and have stopped all writing to Cassandra, with the reads continuing. I'm trying to understand the utilization of memory within the JVM. nodetool info on each of the nodes shows them all growing in footprint, 2 of

[RELEASE] Apache Cassandra 2.0 released

2013-09-03 Thread Sylvain Lebresne
The Cassandra team is very pleased to announce the release of Apache Cassandra version 2.0.0. Cassandra 2.0.0 is a new major release that adds numerous improvements[1,2], including: - Lightweight transactions[4] that offer linearizable consistency. - Experimental Triggers Support[5]. -

Re: CqlStorage creates wrong schema for Pig

2013-09-03 Thread Chad Johnston
You're trying to use FromCqlColumn on a tuple that has been flattened. The schema still thinks it's {title: chararray}, but the flattened tuple is now two values. I don't know how to retrieve the data values in this case. Your code will work correctly if you do this: *values3 = FOREACH rows

Re: row cache

2013-09-03 Thread Chris Burroughs
On 09/01/2013 03:06 PM, Faraaz Sareshwala wrote: Yes, that is correct. The SerializingCacheProvider stores row cache contents off heap. I believe you need JNA enabled for this though. Someone please correct me if I am wrong here. The ConcurrentLinkedHashCacheProvider stores row cache contents

RE: read ?

2013-09-03 Thread Lohfink, Chris
To get an accurate picture you should force a full GC on each node. The heap utilization can be misleading since there can be a lot of things in the heap with no strong references. There are a number of factors that can lead to this. For a true comparison I would recommend using jconsole and

Re: Recommended storage choice for Cassandra on Amazon m1.xlarge instance

2013-09-03 Thread Andrey Ilinykh
You benefit from putting the commit log on a separate drive only if this drive is an isolated spinning device. EC2 ephemeral is a virtual device, so I don't think it makes sense to put the commit log on a separate drive. I would build raid0 from 4 drives and put everything there. But it would be

Re: Versioning in cassandra

2013-09-03 Thread dawood abdullah
Jan, The solution you gave works spot on, but there is one more requirement I forgot to mention. Following is my table structure CREATE TABLE file ( id text, contenttype text, createdby text, createdtime timestamp, description text, name text, parentid text, version timestamp,

RE: read ?

2013-09-03 Thread Lohfink, Chris
Does it actually OOM eventually? There will be a certain amount of object allocation for reads (or anything) which will see the heap creep up until a GC, but at ~500 MB or so of an 8 GB heap there is little reason for the JVM to do it, so it probably just ignores it to save processing. Even the

Re: Upgrade from 1.0.9 to 1.2.8

2013-09-03 Thread Mike Neir
Ah. I was going by the upgrade recommendations in the NEWS.txt file in the cassandra source tree, which didn't make mention of that version (1.0.11) whatsoever. I didn't see any show-stoppers that would have prevented me from going straight from 1.0.9 to 1.2.x.

Re: Versioning in cassandra

2013-09-03 Thread Vivek Mishra
Create a secondary index over parentid, or make it part of the clustering key. -Vivek On Tue, Sep 3, 2013 at 10:42 PM, dawood abdullah muhammed.daw...@gmail.com wrote: Jan, The solution you gave works spot on, but there is one more requirement I forgot to mention. Following is my table structure

Re: Versioning in cassandra

2013-09-03 Thread Vivek Mishra
My bad. I missed the latest-version part. -Vivek On Tue, Sep 3, 2013 at 11:20 PM, dawood abdullah muhammed.daw...@gmail.com wrote: I have tried both options, creating a secondary index and adding parentid to the primary key, but I am getting all the files with parentid

RE: map/reduce performance time and sstable reader….

2013-09-03 Thread java8964 java8964
I am trying to do the same thing. In our project, we want to load the data from Cassandra into a Hadoop cluster, and SSTables are one obvious option, as you can get the data changed since the last batch load directly from the SSTable incremental backup files. But, based on my research so far (I

Re: Cassandra cluster migration in Amazon EC2

2013-09-03 Thread Robert Coli
On Mon, Sep 2, 2013 at 4:21 PM, Renat Gilfanov gren...@mail.ru wrote: - Group 3 of storages into raid0 array, move data directory to the raid0, and commit log - to the 4th left storage. - As far as I understand, separation of commit log and data directory should make performance better - but

Re: Versioning in cassandra

2013-09-03 Thread dawood abdullah
I have tried both options, creating a secondary index and adding parentid to the primary key, but I am getting all the files with parentid 'yyy'. What I want is the latest version of a file for the combination of parentid and fileid. Say below are the records inserted in the file table:

Re: read ?

2013-09-03 Thread Langston, Jim
Thanks Chris, I have about 8 heap dumps that I have been looking at. I have been trying to isolate why I have been dumping heap. I've started by removing the apps that write to Cassandra and eliminating the work that would entail. I am left with just the apps that are reading the data and from

Re: map/reduce performance time and sstable reader….

2013-09-03 Thread Hiller, Dean
We are considering creating our own InputFormat for Hadoop and running the tasktrackers on every 3rd node (i.e., RF=3) such that we cover all ranges. Our M/R overhead appears to be 13 days vs. 12.5 hours on just reading SSTables directly on our current data set. I personally don't think parsing

Re: Versioning in cassandra

2013-09-03 Thread Laing, Michael
try the following. -ml -- put this in file and run using 'cqlsh -f file DROP KEYSPACE latest; CREATE KEYSPACE latest WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 1 }; USE latest; CREATE TABLE file ( parentid text, -- row_key, same for each version id

Re: Versioning in cassandra

2013-09-03 Thread Vivek Mishra
create table file(id text , parentid text,contenttype text,version timestamp, descr text, name text, PRIMARY KEY(id,version) ) WITH CLUSTERING ORDER BY (version DESC); insert into file (id, parentid, version, contenttype, descr, name) values ('f2', 'd1', '2011-03-06', 'pdf', 'f2 file', 'file1');
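
The idea behind the table above can be sketched in plain Python. This is only an in-memory model, not Cassandra code: it assumes that PRIMARY KEY(id, version) with CLUSTERING ORDER BY (version DESC) keeps each file's rows newest-first, so the first row of each partition is the latest version. The first row mirrors the insert quoted above; the other rows and the function name latest_versions are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import date

# Rows as (id, parentid, version, name). The first row mirrors the insert
# quoted above; the remaining rows are hypothetical illustration data.
ROWS = [
    ("f2", "d1", date(2011, 3, 6), "file1"),
    ("f2", "d1", date(2011, 3, 5), "file1"),  # hypothetical older version
    ("f1", "d1", date(2011, 3, 4), "file0"),  # hypothetical second file
]


def latest_versions(rows, parentid):
    """Return the newest row of each file id whose parent is `parentid`."""
    # Group rows by partition key (id).
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)

    result = []
    for file_id in sorted(partitions):
        # Emulate CLUSTERING ORDER BY (version DESC): newest row first.
        ordered = sorted(partitions[file_id], key=lambda r: r[2], reverse=True)
        newest = ordered[0]
        if newest[1] == parentid:
            result.append(newest)
    return result
```

Calling latest_versions(ROWS, "d1") keeps one row per file id, the one with the greatest version, which is the behaviour the thread is asking for.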

Re: [RELEASE] Apache Cassandra 2.0 released

2013-09-03 Thread Jeremiah D Jordan
Thanks for everyone's work on this release! -Jeremiah On Sep 3, 2013, at 8:48 AM, Sylvain Lebresne sylv...@datastax.com wrote: The Cassandra team is very pleased to announce the release of Apache Cassandra version 2.0.0. Cassandra 2.0.0 is a new major release that adds numerous

How to fix host ID collision?

2013-09-03 Thread Renat Gilfanov
Hello, We have a Cassandra cluster with 5 nodes hosted in Amazon EC2, and I had to restart two of them, so their IPs changed. We use NetworkTopologyStrategy, so I simply updated the IPs in the cassandra-topology.properties file. However, as I understand it, the old IPs remained somewhere in the

RE: Update-Replace

2013-09-03 Thread Baskar Duraikannu
I have a similar use case but only need to update a portion of the row. We basically perform a single write (with old and new columns) with a very low TTL value for the old columns. From: jan.algermis...@nordsc.com Subject: Update-Replace Date: Fri, 30 Aug 2013 17:35:48 +0200 To:

RE: List<blob> retrieve performance

2013-09-03 Thread Baskar Duraikannu
I don't know of any. I would check the size of LIST. If it is taking long, it could be just that disk read is taking long. Date: Sat, 31 Aug 2013 16:35:22 -0300 Subject: List<blob> retrieve performance From: savio.te...@lupa.inf.ufg.br To: user@cassandra.apache.org I have a column family with

Re: Versioning in cassandra

2013-09-03 Thread Laing, Michael
I use the technique described in my previous message to handle millions of messages and their versions. Actually, I use timeuuids instead of timestamps, as they have more 'uniqueness'. Also I index my maps by a timeuuid that is the complement (based on a future date) of a current timeuuid. Since
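
The "complement based on a future date" idea can be illustrated with a small Python sketch. The message is truncated, so this rests on an assumption: that the complement is used so newer entries sort before older ones under an ascending order. The sketch works on plain microsecond counts rather than real timeuuids, and the reference date FUTURE and the name complement_micros are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed future reference point; the original message does not
# say which date is actually used.
FUTURE = datetime(2100, 1, 1, tzinfo=timezone.utc)


def complement_micros(ts):
    """Microseconds remaining from `ts` (timezone-aware) until FUTURE.

    Later timestamps yield smaller values, so sorting these values in
    ascending order lists the newest entries first.
    """
    return (FUTURE - ts) // timedelta(microseconds=1)
```

For two timestamps one hour apart, the later one gets the smaller key, so an ascending sort on complement_micros returns it first.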

Re: Update-Replace

2013-09-03 Thread Jan Algermissen
Baskar, On 03.09.2013, at 23:11, Baskar Duraikannu baskar.duraika...@outlook.com wrote: I have a similar use case but only need to update a portion of the row. We basically perform a single write (with old and new columns) with a very low TTL value for the old columns. I found out that using

cqlsh error after enabling encryption

2013-09-03 Thread David Laube
Hi All, After enabling encryption on our Cassandra 1.2.8 nodes, we are receiving the error "Connection error: TSocket read 0 bytes" while attempting to use cqlsh to talk to the ring. I've followed the docs over at

Re[2]: How to fix host ID collision?

2013-09-03 Thread Renat Gilfanov
Thanks a lot for the quick reply. Should I run nodetool repair on all nodes before or after that? Also, the documentation mentions that the auto_bootstrap setting applies only to non-seed nodes. Currently I have specified all nodes as seeds; should I remove nodes with new IP from

Re: List<blob> retrieve performance

2013-09-03 Thread Sávio Teles
The list is null. 2013/9/3 Baskar Duraikannu baskar.duraika...@outlook.com I don't know of any. I would check the size of LIST. If it is taking long, it could be just that disk read is taking long. -- Date: Sat, 31 Aug 2013 16:35:22 -0300 Subject: List<blob>

Fwd: {kundera-discuss} Kundera 2.7 released

2013-09-03 Thread Vivek Mishra
fyip. -- Forwarded message -- From: Vivek Mishra vivek.mis...@impetus.co.in Date: Wed, Sep 4, 2013 at 6:15 AM Subject: {kundera-discuss} Kundera 2.7 released To: kundera-disc...@googlegroups.com kundera-disc...@googlegroups.com Hi All, We are happy to announce the release of