Re: which replica has your data?

2014-04-22 Thread Robert Coli
On Tue, Apr 22, 2014 at 1:55 PM, Russell Bradberry wrote: > nodetool getendpoints That will tell the OP which nodes *should* have the row... To answer which of those replicas *actually have* the row, call the JMX method getSSTablesForKey on each node returned by getendpoints. If there is at leas

Re: which replica has your data?

2014-04-22 Thread Russell Bradberry
nodetool getendpoints On April 22, 2014 at 4:52:08 PM, Han,Meng (meng...@ufl.edu) wrote: Hi all, I have a data item whose row key is 7573657238353137303937323637363334393636363230 and I have a five node Cassandra cluster with replication factor set to 3. Each replica's token is list

which replica has your data?

2014-04-22 Thread Han,Meng
Hi all, I have a data item whose row key is 7573657238353137303937323637363334393636363230 and I have a five node Cassandra cluster with replication factor set to 3. Each replica's token is listed below: TOK: 0 TOK: 34028236692093846346337460743176821145 TOK: 6805647338418769269267492148635364
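The mapping from key to replicas can be sketched offline. This is a minimal Python model assuming RandomPartitioner and SimpleStrategy (neither is stated in the thread): the token is approximated as the key's MD5 taken as a non-negative integer, and replicas are chosen by walking the sorted ring clockwise from the key's token. With vnodes or NetworkTopologyStrategy placement differs, and `nodetool getendpoints` remains the authoritative answer.

```python
import hashlib
from bisect import bisect_left

def random_partitioner_token(key: bytes) -> int:
    """Approximation of RandomPartitioner: the key's MD5 digest taken as
    a non-negative integer, reduced into the [0, 2**127) token range."""
    digest = int.from_bytes(hashlib.md5(key).digest(), "big")
    return digest % (2 ** 127)

def simple_strategy_replicas(ring_tokens, key_token, rf=3):
    """SimpleStrategy placement: the range (prev_token, t] belongs to the
    node with token t; walk clockwise from there for rf replicas."""
    ring = sorted(ring_tokens)
    i = bisect_left(ring, key_token) % len(ring)
    return [ring[(i + k) % len(ring)] for k in range(rf)]

# Illustrative ring of five tokens; a key token of 150 lands in the
# range owned by token 200, and the next two clockwise nodes follow.
ring = [0, 100, 200, 300, 400]
owners = simple_strategy_replicas(ring, 150, rf=3)
```

With the thread's RF=3 on five nodes, three of the five tokens come back; which three depends only on where the key's token falls on the ring.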

Re: fixed size collection possible?

2014-04-22 Thread Tupshin Harper
No, there isn't, though I would like to see such a feature, albeit more at the CQL partition layer than the collection layer. Anyway, that is sometimes referred to as a capped collection in other DBs, and you might find the history in this ticket interesting. It points to ways to simulate the
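The usual simulation is a wide partition clustered newest-first plus a trim. This toy Python model (my own sketch, not taken from the ticket mentioned above) shows the invariant such a scheme maintains; in Cassandra the trim would be a range delete or a TTL-based scheme rather than an in-memory sort.

```python
from collections import defaultdict

class CappedTable:
    """Toy model of a capped 'collection' simulated on a wide row:
    entries are kept newest-first per partition and trimmed to a
    fixed cap after each insert."""

    def __init__(self, cap: int):
        self.cap = cap
        self.partitions = defaultdict(list)  # pk -> [(clustering, value)]

    def insert(self, pk, clustering, value):
        rows = self.partitions[pk]
        rows.append((clustering, value))
        rows.sort(key=lambda r: r[0], reverse=True)  # newest first
        del rows[self.cap:]                          # enforce the cap

    def select(self, pk):
        return list(self.partitions[pk])
```

Only the cap-many newest entries survive per partition, which is the observable behaviour a capped collection promises.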

Re: Doubt

2014-04-22 Thread Chris Lohfink
Generally I've seen it recommended to use a composite CF, since it gives you more flexibility and it's easier to debug. You can get some performance improvements by storing a serialized blob (a lot of data can be represented much smaller this way, by a factor of 10 or more if clever) to represent your
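To make the size trade-off concrete, here is a small illustration (my own, not from the thread) comparing a text encoding of a row of small integers with a packed binary blob; the exact factor depends entirely on the data and the encoding chosen.

```python
import json
import struct

values = list(range(100))               # 100 small integers

as_json = json.dumps(values).encode()   # self-describing, easy to debug
as_blob = struct.pack(f"<{len(values)}h", *values)  # packed int16 blob

# The blob is opaque but smaller; cleverer encodings (varints,
# delta coding, compression) widen the gap further.
```

This is the flexibility/debuggability trade the thread describes: the composite CF keeps every field queryable, while the blob saves space at the cost of opacity.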

Re: fixed size collection possible?

2014-04-22 Thread Chris Lohfink
It isn’t natively supported, but there are some things you can do if you need it. A lot depends on how frequently the list is updated. For heavier workloads I would recommend using a custom CF for this instead of collections. For extreme insert rates you would want to add additional partitioning to

Re: Deleting column names

2014-04-22 Thread Laing, Michael
Your understanding is incorrect - the easiest way to see that is to try it. On Tue, Apr 22, 2014 at 12:00 PM, Sebastian Schmidt wrote: > From my understanding, this would delete all entries with the given s. > Meaning, if I have inserted (sa, p1, o1, c1) and (sa, p2, o2, c2), > executing this: >
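To see why the understanding is incorrect, model the table as a set of full primary keys (s, p, o, c), as in the thread: a DELETE that names the complete primary key removes exactly one row, leaving other rows that share only the partition key s untouched. A toy model:

```python
# Rows keyed by the full primary key (s, p, o, c), per the thread.
rows = {("sa", "p1", "o1", "c1"), ("sa", "p2", "o2", "c2")}

def delete(table, s, p, o, c):
    """Models DELETE ... WHERE s=? AND p=? AND o=? AND c=?:
    the full primary key targets exactly one row."""
    table.discard((s, p, o, c))

delete(rows, "sa", "p1", "o1", "c1")
# ("sa", "p2", "o2", "c2") is still present: rows sharing only the
# partition key s survive the delete.
```

Deleting everything under s would instead be a DELETE constrained only by the partition key, which is the case the original poster wanted to avoid.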

Re: Deleting column names

2014-04-22 Thread Sebastian Schmidt
From my understanding, this would delete all entries with the given s. Meaning, if I have inserted (sa, p1, o1, c1) and (sa, p2, o2, c2), executing this: DELETE FROM table_name WHERE s = sa AND p = p1 AND o = o1 AND c = c1 would delete sa, p1, o1, c1, p2, o2, c2. Is this correct? Or does the abo

BulkOutputFormat and CQL3

2014-04-22 Thread James Campbell
Hi Cassandra Users- I have a Hadoop job that uses the pattern in Cassandra 2.0.6's hadoop_cql3_word_count example to load data from HDFS into Cassandra. Having read about BulkOutputFormat as a way to potentially significantly increase the write throughput from Hadoop to Cassandra, I am conside

Re: Does NetworkTopologyStrategy in Cassandra 2.0 work?

2014-04-22 Thread horschi
Ok, it seems 2.0 now is simply stricter about datacenter names. I simply had to change the datacenter name to match the name in "nodetool ring": update keyspace MYKS with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {datacenter1 : 2}; So the schema was wrong, but 1.2 did
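The stricter behaviour can be pictured as a validation step. This is a toy model of what the thread observed, not Cassandra's actual code: 2.0-era versions reject replication options naming a datacenter the snitch does not report, where 1.2 silently accepted them and queries later failed as Unavailable.

```python
def validate_nts_options(strategy_options: dict, snitch_datacenters):
    """Reject NetworkTopologyStrategy options that name unknown
    datacenters, mirroring the stricter behaviour described here."""
    unknown = set(strategy_options) - set(snitch_datacenters)
    if unknown:
        raise ValueError(f"unknown datacenter(s): {sorted(unknown)}")
    return True
```

With the corrected name from "nodetool ring" (datacenter1 in this thread), validation passes and the replicas become reachable again.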

Does NetworkTopologyStrategy in Cassandra 2.0 work?

2014-04-22 Thread horschi
Hi, is it possible that NetworkTopologyStrategy does not work with Cassandra 2.0 any more? I just updated my dev cluster to 2.0.7 and got UnavailableExceptions for CQL and Thrift queries on my already existing column families, even though all (two) nodes were up. Changing to SimpleStrategy fixed the

Re: Deleting column names

2014-04-22 Thread Laing, Michael
Referring to the original post, I think the confusion is over what a "row" is in this context: "So as far as I understand, the s column is now the *row* key ... Since I have multiple different p, o, c combinations per s, deleting the whole *row* identified by s is no option." The s column is in fact