Re: Multinode Cassandra and sstableloader

2015-04-01 Thread Alain RODRIGUEZ
From Michael Laing (posted on the wrong thread): We use Alain's solution as well to make major operational revisions. We have a red team and a blue team in each AWS region, so we just add and drop datacenters to get where we want to be. Pretty simple. 2015-03-31 15:50 GMT+02:00 Alain

Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Serega Sheypak
Hi, I have 2 Cassandra clusters. cluster1 is DataStax Community 2.1; cluster2 is DataStax DSE. I can run sstableloader from cluster1 (Community) and stream data to cluster2 (DSE), but I get an exception while streaming from cluster2 (DSE) to cluster1 (Community). The exception is: Could not retrieve

Re: SSTable structure

2015-04-01 Thread Serega Sheypak
Hi bharat, you are talking about Cassandra 1.2.5 Does it fit Cassandra 2.1? Were there any significant changes to SSTable format and layout? Thank you, article is interesting. Hi jacob jacob.rho...@me.com, HBase does it for example. http://hbase.apache.org/book.html#_hfile_format_2 It would be

Re: Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Serega Sheypak
Sorry cluster1 community version is: ii cassandra 2.1.3 distributed storage system for structured data cluster2 DSE version is: ii dse-libcassandra4.6.2-1 The DataStax Enterprise package includes a production-certifie 2015-04-01 14:53 GMT+02:00 Serega Sheypak

[SECURITY ANNOUNCEMENT] CVE-2015-0225

2015-04-01 Thread Jake Luciani
CVE-2015-0225: Apache Cassandra remote execution of arbitrary code Severity: Important Vendor: The Apache Software Foundation Versions Affected: Cassandra 1.2.0 to 1.2.19 Cassandra 2.0.0 to 2.0.13 Cassandra 2.1.0 to 2.1.3 Description: Under its default configuration, Cassandra binds an

Frequent timeout issues

2015-04-01 Thread Amlan Roy
Hi, I am new to Cassandra. I have set up a cluster with Cassandra 2.0.13. I am writing the same data to HBase and Cassandra and find that the writes are extremely slow in Cassandra, and I frequently see the exception “Cassandra timeout during write query at consistency ONE”. The cluster size for

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Amlan, Can you provide information on how much data is being written? Are any of the columns really large? Are any writes succeeding or are all timing out? Regards, Eric R Medley On Apr 1, 2015, at 9:03 AM, Amlan Roy amlan@cleartrip.com wrote: Hi, I am new to Cassandra. I have

Datastax driver object mapper and union field

2015-04-01 Thread Craig Ching
Hi! We need to implement a union field in our cassandra data model and we're using the datastax Mapper. Anyone have any recommendations for doing this? I'm thinking something like: public class Value { int dataType; String valueAsString; double valueAsDouble; } If the Value is a String,

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Also, can you provide the table details and the consistency level you are using? Regards, Eric R Medley On Apr 1, 2015, at 9:13 AM, Eric R Medley emed...@xylocore.com wrote: Amlan, Can you provide information on how much data is being written? Are any of the columns really large? Are

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Hi Eric, Thanks for the reply. Some columns are big but I see the issue even when I stop storing the big columns. Some of the writes are timing out, not all. Where can I find the number of writes to Cassandra? Regards, Amlan On 01-Apr-2015, at 7:43 pm, Eric R Medley emed...@xylocore.com

Table design for historical data

2015-04-01 Thread Firdousi Farozan
Hi, My requirement is to design a table for historical state information (not exactly time-series). For example: I have devices connecting to and disconnecting from the management platform. I want to know the details (name, mac, os, image, etc.) for all devices connected to the management platform

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Did not see any exception in cassandra.log and system.log. Monitored using JConsole. Did not see anything wrong. Do I need to see any specific info? Doing almost 1000 writes/sec. HBase and Cassandra are running on different clusters. For cassandra I have 6 nodes with 64GB RAM(Heap is at

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Using the datastax driver without batch. http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/whatsNew2.html On 01-Apr-2015, at 9:15 pm, Brian O'Neill b...@alumni.brown.edu wrote: Are you using the storm-cassandra-cql driver?

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Are HBase and Cassandra running on the same servers? Are the writes to each of these databases happening at the same time? Regards, Eric R Medley On Apr 1, 2015, at 10:12 AM, Brice Dutheil brice.duth...@gmail.com wrote: And the keyspace? What is the replication factor. Also how are the

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Replication factor is 2. CREATE KEYSPACE ct_keyspace WITH replication = { 'class': 'NetworkTopologyStrategy', 'DC1': '2' }; Inserts are happening from Storm using java driver. Using prepared statement without batch. On 01-Apr-2015, at 8:42 pm, Brice Dutheil brice.duth...@gmail.com wrote:

Re: Frequent timeout issues

2015-04-01 Thread Brice Dutheil
And the keyspace? What is the replication factor. Also how are the inserts done? On Wednesday, April 1, 2015, Amlan Roy amlan@cleartrip.com wrote: Write consistency level is ONE. This is the describe output for one of the tables. CREATE TABLE event_data ( event text, week text,

Re: Frequent timeout issues

2015-04-01 Thread Amlan Roy
Write consistency level is ONE. This is the describe output for one of the tables. CREATE TABLE event_data ( event text, week text, bucket int, date timestamp, unique text, adt int, age list<int>, arrival list<timestamp>, bank text, bf double, cabin text, card text, carrier

Re: Frequent timeout issues

2015-04-01 Thread Brian O'Neill
Are you using the storm-cassandra-cql driver? (https://github.com/hmsonline/storm-cassandra-cql) If so, what version? Batching or no batching? -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42

replace_address vs add+removenode

2015-04-01 Thread Ulrich Geilmann
Hi. The documentation suggests using the replace_address startup parameter to replace a dead node. However, it doesn't explain why this is superior to adding a new node and retiring the dead one using nodetool removenode. I assume it can be more efficient since the new node can take over

Re: Frequent timeout issues

2015-04-01 Thread Eric R Medley
Are you seeing any exceptions in the Cassandra logs? What are the loads on your servers? Have you monitored the performance of those servers? How many writes are you performing at a time? How many writes per second? Regards, Eric R Medley On Apr 1, 2015, at 9:40 AM, Amlan Roy

Re: Table design for historical data

2015-04-01 Thread Firdousi Farozan
I will be writing an event when a device connects. A device may not have disconnected as of the current time, and I want to return that device for that time range. Device disconnect is used to mark the end time; any query beyond that time should not return that device. Queries can have ad hoc start and
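A minimal sketch of the matching rule described in this message, in Python. The function name connected_during and the use of None for a device that has not yet disconnected are illustrative assumptions, not part of the thread. It only checks whether a connection interval overlaps a query range and says nothing about how the table itself should be laid out.

```python
from datetime import datetime
from typing import Optional


def connected_during(
    connect: datetime,
    disconnect: Optional[datetime],
    query_start: datetime,
    query_end: datetime,
) -> bool:
    """Return True if the device was connected at any point in the query range.

    disconnect=None models a device that is still connected (assumption).
    """
    # The device must have connected no later than the end of the range.
    started_in_time = connect <= query_end
    # An open-ended connection always reaches the range; otherwise the
    # disconnect must not fall before the start of the range.
    still_there = disconnect is None or disconnect >= query_start
    return started_in_time and still_there


# Hypothetical sample data for illustration only.
jan1 = datetime(2015, 1, 1)
feb1 = datetime(2015, 2, 1)
mar1 = datetime(2015, 3, 1)
apr1 = datetime(2015, 4, 1)

still_connected = connected_during(jan1, None, mar1, apr1)
left_before_query = connected_during(jan1, feb1, mar1, apr1)
```

Treating "no disconnect yet" as an open-ended interval keeps the rule to a single overlap test; a query whose range starts after the disconnect time does not return the device, as the message requires.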

Re: Why select returns tombstoned results?

2015-04-01 Thread Benyi Wang
Unfortunately I'm using 2.1.2. Is it possible to downgrade to 2.0.13 without wiping out the data? I'm worried that there is a bug in 2.1.2. On Tue, Mar 31, 2015 at 4:37 AM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote: What version of Cassandra are you running?

Re: replace_address vs add+removenode

2015-04-01 Thread Anuj Wadehra
In both cases the node needs to bootstrap and get data from other nodes. Removenode has an additional cost, as it leads to additional redistribution of tokens such that all data resides on the remaining nodes as per the replication strategy. On removenode, the remaining nodes will stream data amongst

Re: replace_address vs add+removenode

2015-04-01 Thread Robert Coli
On Wed, Apr 1, 2015 at 9:26 AM, Ulrich Geilmann ulrich.geilm...@freiheit.com wrote: I assume it can be more efficient since the new node can take over the exact tokens of the dead node. Are there any other differences? That's the reason. You get one streaming operation (bootstrap a new node
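The difference discussed in this thread can be illustrated with a toy model in Python. This is a sketch under loud assumptions: one token per node (no vnodes), a small integer token space, a replication factor of 2, and replica placement that simply walks the ring clockwise. The names replicas and gained, the node names, and the token values are all hypothetical; this is not how Cassandra computes streaming plans internally.

```python
from bisect import bisect_left


def replicas(point, ring, rf):
    """Nodes replicating `point`: owner of the first token >= point
    (wrapping around), then the next rf - 1 nodes clockwise.
    `ring` maps token -> node name."""
    tokens = sorted(ring)
    start = bisect_left(tokens, point) % len(tokens)
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(rf)]


def gained(before, after, rf):
    """(segment_end, node) pairs where a node replicates a ring segment in
    `after` that it did not replicate in `before`, so it must receive data."""
    moves = []
    for point in sorted(set(before) | set(after)):
        old = set(replicas(point, before, rf))
        moves += [(point, n) for n in replicas(point, after, rf) if n not in old]
    return moves


RF = 2
original = {0: "A", 25: "B", 50: "C", 75: "D"}  # assume D is the dead node

# Replace: new node E takes over D's exact token.
replaced = {0: "A", 25: "B", 50: "C", 75: "E"}
replace_moves = gained(original, replaced, RF)

# Add + removenode: E bootstraps at a new token, then D is removed.
added = dict(original)
added[87] = "E"
removed = {t: n for t, n in added.items() if n != "D"}
add_remove_moves = gained(original, added, RF) + gained(added, removed, RF)

print("replace:       ", replace_moves)
print("add + remove:  ", add_remove_moves)
```

In this toy ring, replacing moves data only to the new node in a single step, while add plus removenode takes two steps and also makes a surviving node pick up a segment, which matches the reasoning given in the replies.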

Re: Frequent timeout issues

2015-04-01 Thread Robert Coli
On Wed, Apr 1, 2015 at 8:37 AM, Amlan Roy amlan@cleartrip.com wrote: Replication factor is 2. It is relatively unusual for people to use a replication factor of 2, for what it's worth. =Rob
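Rob's message does not say why a replication factor of 2 is unusual. One commonly given reason, offered here as an assumption rather than as his stated argument, is quorum arithmetic: a quorum is floor(RF / 2) + 1 replicas, so with RF = 2 a quorum needs both replicas and tolerates no replica being down. A short Python sketch (the function names are illustrative):

```python
def quorum(rf: int) -> int:
    """Number of replicas that make up a quorum for replication factor rf."""
    return rf // 2 + 1


def tolerated_failures(rf: int) -> int:
    """How many replicas can be down while a quorum is still reachable."""
    return rf - quorum(rf)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerates {tolerated_failures(rf)} down")
```

This only matters for quorum-level reads and writes; the poster in this thread writes at consistency ONE, which needs a single replica to acknowledge.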

Re: Table design for historical data

2015-04-01 Thread Eric R Medley
Firdousi, What kind of events would be stored in the table? Will you be writing an event when a device connects and another when it disconnects or will you write a single event after the device finally disconnects? Also, for your queries, do you want ad-hoc start and end times or do you have a

Re: Frequent timeout issues

2015-04-01 Thread Anuj Wadehra
Are you writing to multiple column families at the same time? Please run nodetool tpstats to make sure that FlushWriter etc. doesn't have high "All time blocked" counts. A blocked memtable FlushWriter may block/drop writes. If that's the case, you may need to increase memtable flush writers. If you have many secondary

Re: Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Michael Shuler
On 04/01/2015 08:10 AM, Serega Sheypak wrote: Sorry cluster1 community version is: ii cassandra 2.1.3 distributed storage system for structured data cluster2 DSE version is: ii dse-libcassandra4.6.2-1 The DataStax Enterprise package includes a

Re: Testing sstableloader between Cassandra 2.1 DSE and community edition 2.1

2015-04-01 Thread Serega Sheypak
Got it. 2015-04-01 20:39 GMT+02:00 Michael Shuler mich...@pbandjelly.org: On 04/01/2015 08:10 AM, Serega Sheypak wrote: Sorry cluster1 community version is: ii cassandra 2.1.3 distributed storage system for structured data cluster2 DSE version is: ii

Re: Cross-datacenter requests taking a very long time.

2015-04-01 Thread Bharatendra Boddu
What type of snitch are you using (endpoint_snitch in cassandra.yaml)? PropertyFileSnitch can improve performance. - bharat On Tue, Mar 31, 2015 at 1:59 PM, daemeon reiydelle daeme...@gmail.com wrote: What is your replication factor? Any idea how much data has to be processed under the

Re: SSTable structure

2015-04-01 Thread Bharatendra Boddu
Hi Serega, Most of the content in the blog article is still relevant. After 1.2.5 (ic), there are only three new versions (ja, jb, ka) for SSTable format. Following are the changes in these versions. // ja (2.0.0): super columns are serialized as composites (note that there is no real

Re: Why select returns tombstoned results?

2015-04-01 Thread Benyi Wang
All servers are running ntpd. I guess the time should be synced across all servers. My dataset is too large to use sstable2json. It would take a long time. I will try a repair to see if the issue is gone. On Tue, Mar 31, 2015 at 7:49 AM, Ken Hancock ken.hanc...@schange.com wrote: Have you

Re: How to store unique visitors in cassandra

2015-04-01 Thread Jim Ancona
Very interesting. I had saved your email from three years ago in hopes of an elegant answer. Thanks for sharing! Jim On Tue, Mar 31, 2015 at 8:16 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: People keep asking me if we finally found a solution (even if this is 3+ years old) so I will just