Dropped Mutation Messages in two DCs at different sites
I need to batch load a lot of data every day into a keyspace that spans two DCs, one on the west coast and one on the east coast. I assume the network delay between two DCs at different sites will cause a lot of dropped mutation messages if I write too fast into the local DC using LOCAL_QUORUM.

I ran this test: the test cluster has two DCs in one network at the same site, but the remote DC has lower-spec hardware than the local one. When I used LOCAL_QUORUM and wrote fast enough, I observed a lot of dropped mutation messages in the remote DC. So I expect the same thing to happen when the two DCs are at different sites.

To my understanding, the coordinator in the local DC sends write requests to all replicas, including the remote ones, and returns SUCCESS to the client once a quorum of the replicas in the local DC respond. Because of the network delay, the remote side processes the requests with a lag, while new requests keep arriving at the local DC's speed. Eventually the queued requests exceed the timeout, and the mutation messages get dropped.

But I am not sure my analysis is correct, because it doesn't account for there being more connections than in a single-DC setup, or for whether the network bandwidth slows down processing in the local DC. If my analysis is correct, the solution could be to either slow down the batch load or configure the remote side with a longer timeout.

My question is: how can I design tests to find out how slow the batch load has to be to avoid dropped mutation messages at the remote site? If my analysis is wrong, could you explain what actually happens in this situation? Thanks.
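One way to turn this into a measurable test (a sketch only; `nodetool tpstats` output format varies by version, so verify the `awk` pattern against your own output before trusting the numbers):

```
# On a remote-DC node, snapshot the dropped-MUTATION counter,
# run the batch load at a fixed rate R, then diff the counter.
before=$(nodetool tpstats | awk '/^MUTATION/ {d=$NF} END {print d}')
# ... run the batch load at rate R for a fixed duration ...
after=$(nodetool tpstats | awk '/^MUTATION/ {d=$NF} END {print d}')
echo "mutations dropped at rate R: $((after - before))"
```

Bisecting on R (halve the rate while drops occur, then narrow the interval) converges on a safe throughput quickly. Raising write_request_timeout_in_ms in cassandra.yaml on the remote nodes can mask a moderate backlog, but a sustained write rate above what the remote DC can absorb will still overflow eventually.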
Re: Is it safe to change RF in this situation?
Thanks a lot. Will do as you suggested.

On Thu, Sep 8, 2016 at 3:08 PM, Hannu Kröger wrote:
> Ok, so I have to say that I'm not 100% sure how many replicas of the data it is trying to maintain, but it should not blow up (if the repair crashes or something, it's ok). So it should be safe to do.
>
> When the repair has run, you can start with the plan I suggested and run repairs afterwards.
>
> Hannu
Re: Is it safe to change RF in this situation?
Thanks. What about this situation:

* Change RF 2 => 3
* Start repair
* Roll back RF 3 => 2
* Repair is still running

I'm wondering what the repair is trying to do. Is the repair trying to fix things as RF=2, or still trying to fix them as RF=3?

On Thu, Sep 8, 2016 at 2:53 PM, Hannu Kröger wrote:
> Yep, you can fix it by running repair, or even faster by changing the
> consistency level to local_quorum and deploying the new version of the app.
>
> Hannu
Re: Is it safe to change RF in this situation?
Thanks Hannu,

Unfortunately, we started changing RF from 2 to 3, and did see the empty-result rate going higher. I assume that "if a LOCAL_ONE read hits the new replica, which is not there yet, the CQL query will return nothing." Is my assumption correct?

On Thu, Sep 8, 2016 at 11:49 AM, Hannu Kröger wrote:
> Hi,
>
> If you change RF=2 -> 3 first, the LOCAL_ONE reads might hit the new
> replica which is not there yet. So I would change LOCAL_ONE -> LOCAL_QUORUM
> first, then change the RF, and then run the repair. LOCAL_QUORUM is
> effectively ALL in your case (RF=2) if you have just one DC, so you can
> change the batch CL later.
>
> Cheers,
> Hannu
Is it safe to change RF in this situation?
* I have a keyspace with RF=2.
* The client reads the table using LOCAL_ONE.
* There is a batch job loading data into the tables using ALL.

I want to change RF to 3 and have both the client and the batch job use LOCAL_QUORUM.

My question is: will the client still read the correct data when the repair is running at the same time my batch job is loading?

Or should I change to LOCAL_QUORUM first?

Thanks.
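The safe ordering that was eventually recommended in this thread can be sketched as follows (the keyspace and DC names are placeholders):

```
-- 1. Switch readers from LOCAL_ONE to LOCAL_QUORUM first (client-side change).

-- 2. Only then raise the replication factor:
ALTER KEYSPACE my_ks
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

-- 3. Build the third replica before anything relies on it:
--    nodetool repair my_ks
```

With RF=2, LOCAL_QUORUM already requires both replicas (a quorum of 2 is 2), so step 1 does not reduce availability; once RF=3 is repaired, a quorum of 3 is 2 and single-node failures are tolerated again.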
Re: Client Read Latency is too high during repair
Never mind. I found the root cause. It has nothing to do with Cassandra or repair; some web services called by the client caused the problem.
Client Read Latency is too high during repair
I'm using the Cassandra Java driver to access a small Cassandra cluster:

* The cluster has 3 nodes in DC1 and 3 nodes in DC2.
* The keyspace was originally created in DC1 only, with RF=2.
* The client had good read latency, about 40 ms at the 99th percentile under 100 requests/sec (measured on the client side).
* Then the keyspace was updated to span both DCs with RF=3 in each DC.
* After the repair started (a DBA started it; I don't know the exact command), the client's read latency reached 2 seconds.
* The metric ClientRequest.read.latency.99percentile is still about 4 ms.
* Two nodes had 3 MB/sec of outgoing streaming.

I'm using Cassandra 2.1.8 and the read consistency is LOCAL_ONE.

Can you point me to some metrics to see where the bottleneck is?

Thanks
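A few server-side places to look when latency climbs during a repair (all standard nodetool commands in the 2.1 line):

```
nodetool compactionstats    # validation compactions kicked off by repair
nodetool netstats           # active streaming sessions and their progress
nodetool proxyhistograms    # coordinator-level read/write latency percentiles
nodetool tpstats            # blocked/pending thread pools, dropped messages
```

If proxyhistograms and the keyspace read-latency metrics stay low while the client-measured latency is high (as here: 4 ms vs. 2 s), the bottleneck is usually outside Cassandra, on the client or network path.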
Re: How to tune Cassandra or Java Driver to get lower latency when there are a lot of writes?
Hi Ryan,

As I said, saveToCassandra doesn't support DELETE. That is why I modified the code of spark-cassandra-connector to allow me to issue DELETEs. What I changed is how an RDD row is bound into a batch of CQL prepared statements.

On Fri, Sep 25, 2015 at 7:22 AM, Ryan Svihla wrote:
> Why aren't you using saveToCassandra (
> https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md)?
> It has a number of locality-aware optimizations that will probably
> exceed your by-hand bulk loading (especially if you're not doing it inside
> something like foreach partition).
>
> Also you can easily tune the size of those tasks, and therefore the
> batches, up and down to minimize harm to the prod system.
>
> Regards,
>
> Ryan Svihla
Re: How to tune Cassandra or Java Driver to get lower latency when there are a lot of writes?
I use Spark and spark-cassandra-connector with a customized Cassandra writer (spark-cassandra-connector doesn't support DELETE). Basically the writer works as follows:

- Bind a row in a Spark RDD to either an INSERT or DELETE PreparedStatement.
- Create a BatchStatement for multiple rows.
- Write to Cassandra.

I know using CQLBulkOutputFormat would be better, but it doesn't support DELETE.

On Thu, Sep 24, 2015 at 1:27 PM, Gerard Maas wrote:
> How are you loading the data? I mean, what insert method are you using?
How to tune Cassandra or Java Driver to get lower latency when there are a lot of writes?
I have a Cassandra cluster that provides data to a web service, and there is a daily batch load writing data into the cluster.

- Without the batch load, the service's 99th-percentile latency is 3 ms. During the load, it jumps to 90 ms.
- I checked the Cassandra keyspace's ReadLatency.99thPercentile, which jumps from 600 microseconds to 1 ms.
- The service's Cassandra Java driver request 99th percentile was 90 ms during the load.

The Java driver took the most time. I know the Cassandra servers are busy writing, but I want to know which metrics can identify where the bottleneck is so that I can tune it.

I'm using Cassandra 2.1.8 and Cassandra Java Driver 2.1.5.
Will virtual nodes have worse performance?
I have a small cluster with 3 nodes and installed Cassandra 2.1.2 from the DataStax YUM repository. I know 2.1.2 is not recommended for production. The problem I observed:

- When I use vnodes with num_tokens=256, the read latency is about 20 ms at the 50th percentile.
- If I disable vnodes, the read latency is about 1 ms at the 50th percentile.

I'm wondering what the root cause of the worse vnode performance is:

- Is version 2.1.2 the root cause?
- Is num_tokens too high for a 3-node cluster?

Thanks.
Re: How to stop "nodetool repair" in 2.1.2?
Using JMX worked. Thanks a lot.

On Wed, Apr 15, 2015 at 3:57 PM, Robert Coli wrote:
> On Wed, Apr 15, 2015 at 3:30 PM, Benyi Wang wrote:
>> It didn't work. I ran the command on all nodes, but I still can see the
>> repair activities.
>
> Your input as an operator who wants a nodetool command to trivially stop
> repairs is welcome here:
>
> https://issues.apache.org/jira/browse/CASSANDRA-3486
>
> For now, your two options are:
>
> 1) restart all nodes participating in the repair
> 2) access the JMX endpoint forceTerminateAllRepairSessions on all nodes
> participating in the repair
>
> =Rob
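For reference, the JMX route can be scripted with a generic JMX client; jmxterm is one option (the jar name below is a placeholder for whatever version you download):

```
# nodetool stop VALIDATION only cancels the validation compactions;
# forceTerminateAllRepairSessions kills the repair sessions themselves.
# Run this against every node participating in the repair:
echo "run -b org.apache.cassandra.db:type=StorageService forceTerminateAllRepairSessions" | \
  java -jar jmxterm-uber.jar -l localhost:7199 -n
```

The MBean and operation (`org.apache.cassandra.db:type=StorageService`, `forceTerminateAllRepairSessions`) are the ones referenced in CASSANDRA-3486.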
Re: How to stop "nodetool repair" in 2.1.2?
It didn't work. I ran the command on all nodes, but I can still see the repair activities.

On Wed, Apr 15, 2015 at 3:20 PM, Sebastian Estevez <sebastian.este...@datastax.com> wrote:
> nodetool stop VALIDATION
How to stop "nodetool repair" in 2.1.2?
I ran "nodetool repair -- keyspace table" for a table, and it is still running after 4 days. I know there is an issue with repair and vnodes: https://issues.apache.org/jira/browse/CASSANDRA-5220. My question is: how can I kill this sequential repair?

I killed the process from which I ran the repair command, but I can still see the repair activities running on different nodes in OpsCenter.

Is there a way to stop the repair without restarting the nodes?

Thanks.
Re: Do I need to run repair and compaction every node?
What about "incremental repair" and "sequential repair"? I ran "nodetool repair -- keyspace table" on one node and found repair sessions running on different nodes. Will this command repair the whole table?

From this page: http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_repair_nodes_c.html#concept_ds_ebj_d3q_gk__opsRepairPrtRng

*Using the nodetool repair -pr (–partitioner-range) option repairs only the first range returned by the partitioner for a node. Other replicas for that range still have to perform the Merkle tree calculation, causing a validation compaction.*

It sounds like -pr runs on one node? I still don't understand "the first range returned by the partitioner for a node".

On Mon, Apr 13, 2015 at 1:40 PM, Robert Coli wrote:
> On Mon, Apr 13, 2015 at 1:36 PM, Benyi Wang wrote:
>> - I need to run compaction on each node,
>
> In general, there is no requirement to manually run compaction. Minor
> compaction occurs in the background, automatically.
>
>> - To repair a table (column family), I only need to run repair on any
>> of the nodes.
>
> It depends on whether you are doing -pr or non -pr repair.
>
> If you are doing -pr repair, you run repair on all nodes. If you do non
> -pr repair, you have to figure out what set of nodes to run it on. That's
> why -pr exists, to simplify this.
>
> =Rob
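Rob's distinction comes down to two concrete patterns (keyspace and table names are placeholders):

```
# -pr repairs only the primary range each node owns, so it must be run
# on EVERY node; together the runs cover each range exactly once:
nodetool repair -pr my_ks my_table

# Without -pr, one run repairs every range the node replicates, so ranges
# get repaired multiple times unless you pick a minimal covering node set:
nodetool repair my_ks my_table
```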
Do I need to run repair and compaction every node?
I have read the documentation several times, but I'm still not quite sure how to run repair and compaction. To my understanding:

- I need to run compaction on each node.
- To repair a table (column family), I only need to run repair on any one of the nodes.

Am I right? Thanks.
Re: Why select returns tombstoned results?
All servers are running ntpd, so I assume the time is synced across them. My dataset is too large for sstable2json; it would take a long time. I will try running repair to see if the issue goes away.

On Tue, Mar 31, 2015 at 7:49 AM, Ken Hancock wrote:
> Have you checked time sync across all servers? The fact that you've
> changed consistency levels and you're getting different results may
> indicate something inherently wrong with the cluster, such as writes being
> dropped or time differences between the nodes.
>
> A brute-force approach to better understand what's going on (especially if
> you have an example of the wrong data being returned) is to do an
> sstable2json on all your tables and simply grep for an example key.
Re: Why select returns tombstoned results?
Unfortunately I'm using 2.1.2. Is it possible to downgrade to 2.0.13 without wiping out the data? I'm worried there may be a bug in 2.1.2.

On Tue, Mar 31, 2015 at 4:37 AM, Paulo Ricardo Motta Gomes <paulo.mo...@chaordicsystems.com> wrote:
> What version of Cassandra are you running? Are you by any chance running
> repairs on your data?
Re: Why select returns tombstoned results?
Thanks for replying.

In cqlsh, if I change to QUORUM (CONSISTENCY QUORUM), sometimes the SELECT returns the deleted row and sometimes it doesn't.

I have two virtual data centers: service (3 nodes) and analytics (4 nodes, collocated with the Hadoop data nodes). The table has 3 replicas in service and 2 in analytics. When I wrote, I wrote into analytics using LOCAL_ONE, so I guess the data may not have replicated to all nodes yet.

I will try using strong consistency for writes.

On Mon, Mar 30, 2015 at 11:59 AM, Prem Yadav wrote:
> Increase the read CL to quorum and you should get correct results.
> How many nodes do you have in the cluster and what is the replication
> factor for the keyspace?
Why select returns tombstoned results?
Create table tomb_test (
   guid text,
   content text,
   range text,
   rank int,
   id text,
   cnt int,
   primary key (guid, content, range, rank)
)

Sometimes I delete rows with the Cassandra Java driver using this query:

DELETE FROM tomb_test WHERE guid=? and content=? and range=?

in an UNLOGGED batch statement, with consistency level LOCAL_ONE.

But if I run

SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and range='week'

or

SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and range='week' and rank = 1

the result shows the deleted rows.

If I run this SELECT, the deleted rows are not shown:

SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1'

If I run the DELETE statement in cqlsh, the deleted rows don't show up.

How can I fix this?
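The behaviour described later in the thread is consistent with writing deletes at LOCAL_ONE against 5 total replicas (3 in service + 2 in analytics): only one replica is guaranteed to hold the tombstone when the read happens. A quick cqlsh check that removes the consistency mismatch (a sketch against the schema above):

```
CONSISTENCY QUORUM;   -- 3 of 5 replicas must answer, so reads and writes overlap

DELETE FROM tomb_test WHERE guid='guid-1' AND content='content-1' AND range='week';
SELECT * FROM tomb_test WHERE guid='guid-1' AND content='content-1' AND range='week';
-- if both the delete and the read run at QUORUM (or both at LOCAL_QUORUM
-- within one DC), the deleted row should no longer appear
```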
Delete columns
In C* 2.1.2, is there a way to delete without specifying the partition key?

create table a_table (
   guid text,
   key1 text,
   key2 text,
   data int,
   primary key (guid, key1, key2)
);

delete from a_table where key1='' and key2='';

I'm trying to avoid doing it like this:

* query the table to get the guids (32 bytes long)
* send back delete queries like: delete from a_table where guid in (...) and key1='' and key2=''

key1 and key2 only have 3~4 values each. If I created one table per (key1, key2) combination it would be easy to delete, but it would result in a larger dataset because of the duplicated guids; the CQL model creates a Cassandra column family like guid, kv1-kv2, ..., kvi-kvj, ..., kvn-kvm, ...

Is there an API that can drop columns in a column family?
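As far as I know, CQL deletes must always be addressed by partition key, so the two-step pattern is hard to avoid. A sketch of it (values and guids are placeholders; whether the clustering-column scan needs ALLOW FILTERING or a secondary index depends on your version):

```
-- step 1: find the partitions (a full scan; expensive on a large table)
SELECT guid FROM a_table WHERE key1='v1' AND key2='v2' ALLOW FILTERING;

-- step 2: delete by partition key, batching the guids
DELETE FROM a_table WHERE guid IN ('g1', 'g2', 'g3') AND key1='v1' AND key2='v2';
```

An alternative that avoids the scan entirely is an inverted lookup table partitioned by (key1, key2) that stores the guids, maintained at write time.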
Re: How to bulkload into a specific data center?
On Fri, Jan 9, 2015 at 3:55 PM, Robert Coli wrote:
> On Fri, Jan 9, 2015 at 11:38 AM, Benyi Wang wrote:
>> - Is it possible to modify SSTableLoader to allow it to access one data center?
>
> Even if you only write to nodes in DC A, if you replicate that data to DC B,
> it will have to travel over the WAN anyway. What are you trying to avoid?

I'm lucky that those are virtual data centers in a LAN. I just don't want a load burst in the "service" virtual data center, because it may degrade the REST service. I'm trying to load data into the "analytics" virtual data center, then let Cassandra "slowly" replicate the data into the "service" virtual data center. It is OK for the REST service to read some old data while the replication is underway. I'm wondering if I should just use "throttle speed in Mbits" to solve my problem?

>> Because I may load ~100 million, I think spark-cassandra-connector might be
>> too slow. I'm wondering if the "copy-the-sstables/nodetool refresh" method in
>> http://www.pythian.com/blog/bulk-loading-options-for-cassandra/ would be a good
>> choice. I'm still a newbie to Cassandra and could not understand what the
>> author said on that page.
>
> The author of that post is as wise as he is modest... ;D
>
>> One of my questions is:
>>
>> * When I run a Spark job in YARN mode, the sstables are created in the YARN
>> working directory.
>> * Assume I have a way to copy the files into the Cassandra directory on the
>> same node.
>> * Because the data are distributed across all of the analytics data center's
>> nodes, each one has only part of the sstables: node A has part A, node B has
>> part B. If I run refresh on each node, eventually node A has parts A and B,
>> and node B has parts A and B too. Am I right?
>
> I'm not sure I fully understand your question, but...
>
> In order to run refresh without having to immediately run cleanup, you
> need to have SSTables which contain data only for ranges owned by the node
> you are loading them on.
>
> So for an RF=3, N=3 cluster without vnodes (simple case), data is naturally
> on every node.
>
> For an RF=3, N=6 cluster A B C D E F, node C contains:
>
> - Third replica for A.
> - Second replica for B.
> - First replica for C.
>
> In order for you to generate the correct SSTable, you need to understand
> all 3 replicas that should be there. With vnodes and nodes joining and
> parting, this becomes more difficult.
>
> That's why people tend to use SSTableLoader and the streaming interface:
> with SSTableLoader, Cassandra takes input which might live on any replica
> and sends it to the appropriate nodes.
>
> =Rob
> http://twitter.com/rcolidba

I'd better stay with SSTableLoader. Thanks for your explanation.
Re: How to bulkload into a specific data center?
Hi Ryan,

Thanks for your reply. Now I understand how SSTableLoader works.

- If I understand correctly, the current o.a.c.io.sstable.SSTableLoader doesn't use LOCAL_ONE or LOCAL_QUORUM. Is that right?
- Is it possible to modify SSTableLoader to allow it to access one data center?

Because I may load ~100 million, I think spark-cassandra-connector might be too slow. I'm wondering if the "copy-the-sstables/nodetool refresh" method in http://www.pythian.com/blog/bulk-loading-options-for-cassandra/ would be a good choice. I'm still a newbie to Cassandra and could not understand what the author said on that page.

One of my questions is:

* When I run a Spark job in YARN mode, the sstables are created in the YARN working directory.
* Assume I have a way to copy the files into the Cassandra directory on the same node.
* Because the data are distributed across all of the analytics data center's nodes, each one has only part of the sstables: node A has part A, node B has part B. If I run refresh on each node, eventually node A has parts A and B, and node B has parts A and B too. Am I right?

Thanks.

On Thu, Jan 8, 2015 at 6:34 AM, Ryan Svihla wrote:
> Just noticed you'd sent this to the dev list; this is a question for only
> the user list, and please do not send questions of this type to the
> developer list.
>
> On Thu, Jan 8, 2015 at 8:33 AM, Ryan Svihla wrote:
>
>> The nature of replication factor is such that writes will go wherever
>> there is replication. If you want responses to be faster, and not
>> involve the REST data center in the Spark job's response, I suggest using
>> a CQL driver with LOCAL_ONE or LOCAL_QUORUM consistency level (look at the
>> spark-cassandra-connector here:
>> https://github.com/datastax/spark-cassandra-connector). While write
>> traffic will still be replicated to the REST service data center, because
>> you do want those results available, you will not be waiting on the remote
>> data center to respond "successful".
>>
>> Final point: bulk loading sends a copy per replica across the wire. So
>> let's say you have RF=3 in each data center; that means bulk loading will
>> send out 6 copies from that client at once, whereas normal mutations via
>> Thrift or CQL writes between data centers go out as 1 copy, and then that
>> node forwards on to the other replicas. This means inter-data-center
>> traffic in this case would be 3x more with the bulk loader than with a
>> traditional CQL or Thrift based client.
>>
>> Thanks,
>> Ryan Svihla
How to bulkload into a specific data center?
I set up two virtual data centers, one for analytics and one for a REST service. The analytics data center sits on top of a Hadoop cluster. I want to bulk load my ETL results into the analytics data center so that the REST service won't take the heavy load. I'm using CQLTableInputFormat in my Spark application, and I gave the nodes in the analytics data center as the initial addresses.

However, I found my jobs were connecting to the REST service data center.

How can I specify the data center?
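Since replication will carry the data to every DC regardless of where the load lands, throttling is the usual fallback discussed later in the thread. A sketch of the knobs (host names and paths are placeholders; `setinterdcstreamthroughput` only exists in newer versions, so check yours):

```
# Client side: cap sstableloader's streaming rate (Mbits/sec)
sstableloader -d analytics-node1,analytics-node2 -t 50 /path/to/my_ks/my_table

# Server side: cap streaming throughput on the receiving nodes
nodetool setstreamthroughput 50
nodetool setinterdcstreamthroughput 50   # inter-DC cap, where available
```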
Is it possible to delete columns or row using CQLSSTableWriter?
CQLSSTableWriter only accepts an INSERT or UPDATE statement. I'm wondering whether it could be made to accept DELETE statements. I need to update my Cassandra table with a lot of data every day:

* I may need to delete a row (given the partition key).
* I may need to delete some columns. For example, if there are 20 rows for a primary key before loading, the new load may have only 10 rows.

Because CQLSSTableWriter writes into a blank table, would a DELETE put a tombstone in the sstable so that the row on the server is deleted after bulk loading?

Thanks.
Is it possible to flush memtable in one virtual center?
We have one ring with two virtual data centers in our Cassandra cluster: one is for real-time traffic and the other is for analytics. My questions are:

1. Are there memtables in the analytics data center? To my understanding, yes.
2. Is it possible to flush the memtables in the analytics data center only?

I'm using Cassandra 1.0.7 for this cluster. Thanks.
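Since nodetool flush operates on a single node, flushing "one DC" amounts to flushing every node in that DC (host names and keyspace/table names are placeholders):

```
# Flush the memtables of one table on each analytics-DC node
for h in analytics-node-1 analytics-node-2 analytics-node-3 analytics-node-4; do
  nodetool -h "$h" flush my_ks my_table
done
```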
What happens at server side when using SSTableLoader
Is there a page explaining what happens on the server side when using SSTableLoader? I'm seeking answers to the following questions:

1. What about the existing data in the table? From my test, the data in the sstable files is applied on top of the existing data. Am I right?
   - New rows or columns in the sstable are created.
   - Existing columns are updated.
   - Deleted rows/columns are also applied.
2. What is the impact on read operations while you are bulk loading data?

Thanks.