Dropped Mutation Messages in two DCs at different sites

2017-01-03 Thread Benyi Wang
I need to batch load a lot of data every day into a keyspace that spans two
DCs; one DC is on the west coast and the other is on the east coast.

I assume that the network delay between two DCs at different sites will
cause a lot of dropped mutation messages if I write too fast into the local
DC using LOCAL_QUORUM.

I did this test: the test cluster has two DCs on one network at the same
site, but the remote DC has a lower hardware configuration than the local
one. When I used LOCAL_QUORUM and wrote fast enough, I observed a lot of
dropped mutation messages in the remote DC. So I guess the same thing will
happen if the two DCs are at different sites.

To my understanding, the coordinator in the local DC sends write requests
to all replicas, including the remote ones, and returns SUCCESS to the
client as soon as a quorum of the replicas in the local DC respond. Because
of the network delay, the remote side processes the requests late, while
new requests keep arriving at the local DC's pace. Eventually the queued
requests exceed the write timeout, and mutation messages get dropped.

But I am not sure my analysis is correct, because it doesn't account for
there being more connections than in the single-DC case, or for whether the
network bandwidth also slows down processing in the local DC.

If my analysis is correct, the solution could be either to slow down the
batch load or to configure the remote side with a longer timeout. My
question is how I can design tests to find out how slow the batch load
needs to be to avoid dropped mutation messages at the remote site.
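
If the analysis is right, one way to find that speed empirically is to cap
the number of in-flight writes on the client and raise the cap until the
remote DC starts reporting dropped mutations. A minimal sketch with the Java
driver; the contact point, keyspace, table, and the cap of 256 are
placeholders, not a recommendation:

import com.datastax.driver.core.*;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import java.util.concurrent.Semaphore;

public class ThrottledLoader {
    public static void main(String[] args) throws Exception {
        Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
        Session session = cluster.connect("my_ks");
        PreparedStatement ps = session.prepare("INSERT INTO my_table (k, v) VALUES (?, ?)");

        // Allow at most 256 writes in flight; raise this until the remote DC
        // starts reporting dropped MUTATION messages, then back off.
        final Semaphore inFlight = new Semaphore(256);

        for (int i = 0; i < 1000000; i++) {
            inFlight.acquire();
            Statement stmt = ps.bind("key-" + i, "value-" + i)
                    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            ResultSetFuture f = session.executeAsync(stmt);
            Futures.addCallback(f, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) { inFlight.release(); }
                public void onFailure(Throwable t)  { inFlight.release(); }
            });
        }
        cluster.close();
    }
}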

If my analysis is wrong, could you explain what actually happens in this
situation?

Thanks.


Re: Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
Thanks a lot. Will do as you suggested.

On Thu, Sep 8, 2016 at 3:08 PM, Hannu Kröger  wrote:

> Ok, so I have to say that I'm not 100% sure how many replicas of the data
> it is trying to maintain, but it should not blow up (if the repair crashes
> or something, it's ok). So it should be safe to do.
>
> When the repair has run you can start with the plan I suggested and run
> repairs afterwards.
>
> Hannu
>
> On 8 Sep 2016, at 18:01, Benyi Wang  wrote:
>
> Thanks. What about this situation:
>
> * Change RF 2 => 3
> * Start repair
> * Roll back RF 3 => 2
> * repair is still running
>
> I'm wondering what the repair is trying to do? The repair is trying to fix
> as RF=2 or still trying to fix like RF=3?
>
> On Thu, Sep 8, 2016 at 2:53 PM, Hannu Kröger  wrote:
>
>> Yep, you can fix it by running repair or even faster by changing the
>> consistency level to local_quorum and deploying the new version of the app.
>>
>> Hannu
>>
>> On 8 Sep 2016, at 17:51, Benyi Wang  wrote:
>>
>> Thanks Hannu,
>>
>> Unfortunately, we started changing RF from 2 to 3, and did see the empty
>> result rate is going higher. I assume that  "If the LOCAL_ONE read hit the
>> new replica which is not there yet, the CQL query will return nothing." Is
>> my assumption correct?
>>
>> On Thu, Sep 8, 2016 at 11:49 AM, Hannu Kröger  wrote:
>>
>>> Hi,
>>>
>>> If you change RF=2 -> 3 first, the LOCAL_ONE reads might hit the new
>>> replica which is not there yet. So I would change LOCAL_ONE -> LOCAL_QUORUM
>>> first and then change the RF and then run the repair. LOCAL_QUORUM is
>>> effectively ALL in your case (RF=2) if you have just one DC, so you can
>>> change the batch CL later.
>>>
>>> Cheers,
>>> Hannu
>>>
>>> > On 8 Sep 2016, at 14:42, Benyi Wang  wrote:
>>> >
>>> > * I have a keyspace with RF=2;
>>> > * The client read the table using LOCAL_ONE;
>>> > * There is a batch job loading data into the tables using ALL.
>>> >
>>> > I want to change RF to 3 and both the client and the batch job use
>>> LOCAL_QUORUM.
>>> >
>>> > My question is "Will the client still read the correct data when the
>>> repair is running at the time my batch job loading is running too?"
>>> >
>>> > Or should I change to LOCAL_QUORUM first?
>>> >
>>> > Thanks.
>>>
>>>
>>
>>
>
>


Re: Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
Thanks. What about this situation:

* Change RF 2 => 3
* Start repair
* Roll back RF 3 => 2
* repair is still running

I'm wondering what the repair is trying to do: is it repairing as if RF=2,
or still as if RF=3?

On Thu, Sep 8, 2016 at 2:53 PM, Hannu Kröger  wrote:

> Yep, you can fix it by running repair or even faster by changing the
> consistency level to local_quorum and deploying the new version of the app.
>
> Hannu
>
> On 8 Sep 2016, at 17:51, Benyi Wang  wrote:
>
> Thanks Hannu,
>
> Unfortunately, we started changing RF from 2 to 3, and did see the empty
> result rate is going higher. I assume that  "If the LOCAL_ONE read hit the
> new replica which is not there yet, the CQL query will return nothing." Is
> my assumption correct?
>
> On Thu, Sep 8, 2016 at 11:49 AM, Hannu Kröger  wrote:
>
>> Hi,
>>
>> If you change RF=2 -> 3 first, the LOCAL_ONE reads might hit the new
>> replica which is not there yet. So I would change LOCAL_ONE -> LOCAL_QUORUM
>> first and then change the RF and then run the repair. LOCAL_QUORUM is
>> effectively ALL in your case (RF=2) if you have just one DC, so you can
>> change the batch CL later.
>>
>> Cheers,
>> Hannu
>>
>> > On 8 Sep 2016, at 14:42, Benyi Wang  wrote:
>> >
>> > * I have a keyspace with RF=2;
>> > * The client read the table using LOCAL_ONE;
>> > * There is a batch job loading data into the tables using ALL.
>> >
>> > I want to change RF to 3 and both the client and the batch job use
>> LOCAL_QUORUM.
>> >
>> > My question is "Will the client still read the correct data when the
>> repair is running at the time my batch job loading is running too?"
>> >
>> > Or should I change to LOCAL_QUORUM first?
>> >
>> > Thanks.
>>
>>
>
>


Re: Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
Thanks Hannu,

Unfortunately, we already started changing RF from 2 to 3, and did see the
empty-result rate go up. I assume that if a LOCAL_ONE read hits the new
replica, which doesn't have the data yet, the CQL query returns nothing. Is
my assumption correct?

On Thu, Sep 8, 2016 at 11:49 AM, Hannu Kröger  wrote:

> Hi,
>
> If you change RF=2 -> 3 first, the LOCAL_ONE reads might hit the new
> replica which is not there yet. So I would change LOCAL_ONE -> LOCAL_QUORUM
> first and then change the RF and then run the repair. LOCAL_QUORUM is
> effectively ALL in your case (RF=2) if you have just one DC, so you can
> change the batch CL later.
>
> Cheers,
> Hannu
>
> > On 8 Sep 2016, at 14:42, Benyi Wang  wrote:
> >
> > * I have a keyspace with RF=2;
> > * The client read the table using LOCAL_ONE;
> > * There is a batch job loading data into the tables using ALL.
> >
> > I want to change RF to 3 and both the client and the batch job use
> LOCAL_QUORUM.
> >
> > My question is "Will the client still read the correct data when the
> repair is running at the time my batch job loading is running too?"
> >
> > Or should I change to LOCAL_QUORUM first?
> >
> > Thanks.
>
>


Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
* I have a keyspace with RF=2;
* The client reads the table using LOCAL_ONE;
* There is a batch job loading data into the tables using ALL.

I want to change RF to 3 and have both the client and the batch job use
LOCAL_QUORUM.

My question is: will the client still read correct data while the repair is
running at the same time as my batch load?

Or should I change to LOCAL_QUORUM first?

Thanks.
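
For reference, a minimal sketch of the order suggested in the replies above
(raise the read consistency level first, then the RF, then repair), assuming
NetworkTopologyStrategy and placeholder keyspace, DC name, and contact
point:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.Session;

public class RaiseRf {
    public static void main(String[] args) {
        // Step 1: have the reading client use LOCAL_QUORUM before touching the RF.
        // Here it is set as the cluster-wide default; it can also be set per statement.
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1")
                .withQueryOptions(new QueryOptions()
                        .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
                .build();
        Session session = cluster.connect();

        // Step 2: raise the replication factor.
        session.execute("ALTER KEYSPACE my_ks WITH replication = "
                + "{'class': 'NetworkTopologyStrategy', 'dc1': 3}");

        // Step 3: run repair (e.g. nodetool repair my_ks on the nodes) so the
        // new replicas actually receive the existing data.
        cluster.close();
    }
}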


Re: Client Read Latency is too high during repair

2016-08-19 Thread Benyi Wang
Never mind. I found the root cause. This has nothing to do with Cassandra
and repair. Some web services called by the client caused the problem.

On Fri, Aug 19, 2016 at 11:53 AM, Benyi Wang  wrote:

> I'm using cassandra java driver to access a small cassandra cluster
>
> * The cluster have 3 nodes in DC1 and 3 nodes in DC2
> * The keyspace is originally created in DC1 only with RF=2
> * The client had good read latency about 40 ms of 99 percentile under 100
> requests/sec (measured at the client side)
> * Then keyspace is updated with 2-DC and RF=3 for each DC
> * After the repair started (DBA started it, I don't exactly the command),
> the client's read latency reached to 2 secs.
> * The metric ClientRequest.read.latency.99percentile is still about 4ms
> * There were two nodes having 3MB/sec outgoing streaming.
>
> I'm using Cassandra 2.1.8 and the read consistency is LOCAL_ONE.
>
> Can you point me some metrics to see what's the bottleneck?
>
> Thanks
>


Client Read Latency is too high during repair

2016-08-19 Thread Benyi Wang
I'm using the Cassandra Java driver to access a small Cassandra cluster.

* The cluster has 3 nodes in DC1 and 3 nodes in DC2.
* The keyspace was originally created in DC1 only, with RF=2.
* The client had good read latency, about 40 ms at the 99th percentile under
100 requests/sec (measured on the client side).
* Then the keyspace was updated to span both DCs, with RF=3 in each DC.
* After the repair started (the DBA started it; I don't know the exact
command), the client's read latency reached 2 seconds.
* The metric ClientRequest.read.latency.99percentile is still about 4 ms.
* Two nodes showed 3 MB/s of outgoing streaming.

I'm using Cassandra 2.1.8 and the read consistency is LOCAL_ONE.

Can you point me to some metrics to see where the bottleneck is?

Thanks
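
For reference, the driver itself exposes client-side request latency, which
helps tell whether the extra time is spent on the Cassandra side or between
the client and the coordinator. A minimal sketch for Java driver 2.1; the
contact point is a placeholder:

import com.codahale.metrics.Snapshot;
import com.datastax.driver.core.Cluster;

public class DriverLatency {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
        cluster.init();

        // Latency as measured by the driver, i.e. including network time and
        // any client-side queuing; the timer records values in nanoseconds.
        Snapshot latency = cluster.getMetrics().getRequestsTimer().getSnapshot();
        System.out.printf("driver p99 = %.1f ms%n",
                latency.get99thPercentile() / 1000000.0);

        // Retries and errors often explain a gap between server-side and
        // client-side numbers.
        System.out.println("retries = "
                + cluster.getMetrics().getErrorMetrics().getRetries().getCount());

        cluster.close();
    }
}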


Re: How to tune Cassandra or Java Driver to get lower latency when there are a lot of writes?

2015-09-25 Thread Benyi Wang
Hi Ryan,

As I said, saveToCassandra doesn't support DELETE. This is why I modified
the spark-cassandra-connector code to let me issue DELETEs. What I changed
is how an RDD row is bound into a batch of CQL PreparedStatements.



On Fri, Sep 25, 2015 at 7:22 AM, Ryan Svihla  wrote:

> Why aren’t you using saveToCassandra (
> https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md)?
> They have a number of locality aware optimizations that will probably
> exceed your by hand bulk loading (especially if you’re not doing it inside
> something like foreach partition).
>
> Also you can easily tune up and down the size of those tasks and therefore
> batches to minimize harm on the prod system.
>
> On Sep 24, 2015, at 5:37 PM, Benyi Wang  wrote:
>
> I use Spark and spark-cassandra-connector with a customized Cassandra
> writer (spark-cassandra-connector doesn’t support DELETE). Basically the
> writer works as follows:
>
>- Bind a row in Spark RDD with either INSERT/Delete PreparedStatement
>- Create a BatchStatement for multiple rows
>- Write to Cassandra.
>
> I knew using CQLBulkOutputFormat would be better, but it doesn't supports
> DELETE.
> ​
>
> On Thu, Sep 24, 2015 at 1:27 PM, Gerard Maas 
> wrote:
>
>> How are you loading the data? I mean, what insert method are you using?
>>
>> On Thu, Sep 24, 2015 at 9:58 PM, Benyi Wang 
>> wrote:
>>
>>> I have a cassandra cluster provides data to a web service. And there is
>>> a daily batch load writing data into the cluster.
>>>
>>>- Without the batch loading, the service’s Latency 99thPercentile is
>>>3ms. But during the load, it jumps to 90ms.
>>>- I checked cassandra keyspace’s ReadLatency.99thPercentile, which
>>>jumps to 1ms from 600 microsec.
>>>- The service’s cassandra java driver request 99thPercentile was
>>>90ms during the load
>>>
>>> The java driver took the most time. I knew the Cassandra servers are
>>> busy in writing, but I want to know what kinds of metrics can identify
>>> where is the bottleneck so that I can tune it.
>>>
>>> I’m using Cassandra 2.1.8 and Cassandra Java Driver 2.1.5.
>>> ​
>>>
>>
>>
>
> Regards,
>
> Ryan Svihla
>
>


Re: How to tune Cassandra or Java Driver to get lower latency when there are a lot of writes?

2015-09-24 Thread Benyi Wang
I use Spark and the spark-cassandra-connector with a customized Cassandra
writer (the spark-cassandra-connector doesn't support DELETE). Basically the
writer works as follows:

   - Bind each row of a Spark RDD to either an INSERT or a DELETE
   PreparedStatement
   - Create a BatchStatement for multiple rows
   - Write the batch to Cassandra.

I know using CQLBulkOutputFormat would be better, but it doesn't support
DELETE.
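
For reference, a minimal sketch of that writer pattern outside of Spark,
with a placeholder keyspace, table, and row model; the connector change
essentially does this once per RDD partition:

import com.datastax.driver.core.*;
import java.util.List;

public class UpsertOrDeleteWriter {
    // Placeholder row model: 'deleted' decides which statement the row binds to.
    static class Record { String key; String value; boolean deleted; }

    static void write(Session session, List<Record> records) {
        PreparedStatement insert = session.prepare(
                "INSERT INTO my_ks.my_table (k, v) VALUES (?, ?)");
        PreparedStatement delete = session.prepare(
                "DELETE FROM my_ks.my_table WHERE k = ?");

        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        batch.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);

        for (Record r : records) {
            batch.add(r.deleted ? delete.bind(r.key) : insert.bind(r.key, r.value));
            if (batch.size() >= 50) {            // keep batches small
                session.execute(batch);
                batch.clear();
            }
        }
        if (batch.size() > 0) session.execute(batch);
    }
}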
​

On Thu, Sep 24, 2015 at 1:27 PM, Gerard Maas  wrote:

> How are you loading the data? I mean, what insert method are you using?
>
> On Thu, Sep 24, 2015 at 9:58 PM, Benyi Wang  wrote:
>
>> I have a cassandra cluster provides data to a web service. And there is a
>> daily batch load writing data into the cluster.
>>
>>- Without the batch loading, the service’s Latency 99thPercentile is
>>3ms. But during the load, it jumps to 90ms.
>>- I checked cassandra keyspace’s ReadLatency.99thPercentile, which
>>jumps to 1ms from 600 microsec.
>>- The service’s cassandra java driver request 99thPercentile was 90ms
>>during the load
>>
>> The java driver took the most time. I knew the Cassandra servers are busy
>> in writing, but I want to know what kinds of metrics can identify where is
>> the bottleneck so that I can tune it.
>>
>> I’m using Cassandra 2.1.8 and Cassandra Java Driver 2.1.5.
>> ​
>>
>
>


How to tune Cassandra or Java Driver to get lower latency when there are a lot of writes?

2015-09-24 Thread Benyi Wang
I have a Cassandra cluster that provides data to a web service, and there
is a daily batch load writing data into the cluster.

   - Without the batch load, the service's 99th-percentile latency is 3 ms.
   During the load, it jumps to 90 ms.
   - I checked the Cassandra keyspace's ReadLatency.99thPercentile, which
   jumps to 1 ms from 600 microseconds.
   - The service's Cassandra Java driver request 99thPercentile was 90 ms
   during the load.

The Java driver accounts for most of the time. I know the Cassandra servers
are busy writing, but I want to know which metrics can identify where the
bottleneck is so that I can tune it.

I’m using Cassandra 2.1.8 and Cassandra Java Driver 2.1.5.
​


Will virtual nodes have worse performance?

2015-07-15 Thread Benyi Wang
I have a small cluster with 3 nodes and installed Cassandra 2.1.2 from the
DataStax YUM repository. I know 2.1.2 is not recommended for production.

The problem I observed is:

   - When I use vnodes with num_tokens=256, the read latency is about 20 ms
   at the 50th percentile.
   - If I disable vnodes, the read latency is about 1 ms at the 50th
   percentile.

I'm wondering what the root cause of the worse vnode performance is:

   - Is version 2.1.2 the root cause?
   - Is num_tokens=256 too high for a 3-node cluster?

Thanks.


Re: How to stop "nodetool repair" in 2.1.2?

2015-04-15 Thread Benyi Wang
Using JMX worked. Thanks a lot.
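
For reference, a minimal sketch of that JMX call, assuming the default JMX
port 7199 and no JMX authentication; it has to be invoked on every node
participating in the repair:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class StopRepairs {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName ss = new ObjectName("org.apache.cassandra.db:type=StorageService");
            // Takes no arguments; stops the active repair sessions on that node.
            mbs.invoke(ss, "forceTerminateAllRepairSessions",
                    new Object[0], new String[0]);
        } finally {
            jmxc.close();
        }
    }
}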

On Wed, Apr 15, 2015 at 3:57 PM, Robert Coli  wrote:

> On Wed, Apr 15, 2015 at 3:30 PM, Benyi Wang  wrote:
>
>> It didn't work. I ran the command on all nodes, but I still can see the
>> repair activities.
>>
>
> Your input as an operator who wants a nodetool command to trivially stop
> repairs is welcome here :
>
> https://issues.apache.org/jira/browse/CASSANDRA-3486
>
> For now, your two options are :
>
> 1) restart all nodes participating in the repair
> 2) access the JMX endpoint forceTerminateAllRepairSessions on all nodes
> participating in the repair
>
> =Rob
>
>


Re: How to stop "nodetool repair" in 2.1.2?

2015-04-15 Thread Benyi Wang
It didn't work. I ran the command on all nodes, but I can still see the
repair activity.

On Wed, Apr 15, 2015 at 3:20 PM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> nodetool stop *VALIDATION*
> On Apr 15, 2015 5:16 PM, "Benyi Wang"  wrote:
>
>> I ran "nodetool repair -- keyspace table" for a table, and it is still
>> running after 4 days. I knew there is an issue for repair with vnodes
>> https://issues.apache.org/jira/browse/CASSANDRA-5220. My question is how
>> I can kill this sequential repair?
>>
>> I killed the process which I ran the repair command. But I still can find
>> the repair activities running on different nodes in OpsCenter.
>>
>> Is there a way I can stop the repair without restarting the nodes?
>>
>> Thanks.
>>
>


How to stop "nodetool repair" in 2.1.2?

2015-04-15 Thread Benyi Wang
I ran "nodetool repair -- keyspace table" for a table, and it is still
running after 4 days. I know there is an issue with repair and vnodes
(https://issues.apache.org/jira/browse/CASSANDRA-5220). My question is: how
can I kill this sequential repair?

I killed the process from which I ran the repair command, but I can still
see the repair activities running on different nodes in OpsCenter.

Is there a way to stop the repair without restarting the nodes?

Thanks.


Re: Do I need to run repair and compaction every node?

2015-04-13 Thread Benyi Wang
What about "incremental repair" and "sequential repair"?

I ran "nodetool repair -- keyspace table" on one node. I found the repair
sessions running on different nodes. Will this command repair the whole
table?

In this page:
http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_repair_nodes_c.html#concept_ds_ebj_d3q_gk__opsRepairPrtRng

*Using the nodetool repair -pr (--partitioner-range) option repairs only the
first range returned by the partitioner for a node. Other replicas for that
range still have to perform the Merkle tree calculation, causing a
validation compaction.*

Does that sound like -pr runs on only one node?
I still don't understand "the first range returned by the partitioner for a
node".

On Mon, Apr 13, 2015 at 1:40 PM, Robert Coli  wrote:

> On Mon, Apr 13, 2015 at 1:36 PM, Benyi Wang  wrote:
>
>>
>>- I need to run compaction one each node,
>>
>> In general, there is no requirement to manually run compaction. Minor
> compaction occurs in the background, automatically.
>
>>
>>- To repair a table (column family), I only need to run repair on any
>>of nodes.
>>
>> It depends on whether you are doing -pr or non -pr repair.
>
> If you are doing -pr repair, you run repair on all nodes. If you do non
> -pr repair, you have to figure out what set of nodes to run it on. That's
> why -pr exists, to simplify this.
>
> =Rob
>
>


Do I need to run repair and compaction every node?

2015-04-13 Thread Benyi Wang
I have read the documentation several times, but I'm still not quite sure
how to run repair and compaction.

To my understanding:

   - I need to run compaction on each node,
   - To repair a table (column family), I only need to run repair on any one
   of the nodes.

Am I right?

Thanks.


Re: Why select returns tombstoned results?

2015-04-01 Thread Benyi Wang
All servers are running ntpd. I guess the time should be synced across all
servers.

My dataset is too large for sstable2json; it would take a long time.

I will try a repair to see if the issue goes away.

On Tue, Mar 31, 2015 at 7:49 AM, Ken Hancock 
wrote:

> Have you checked time sync across all servers?  The fact that you've
> changed consistency levels and you're getting different results may
> indicate something inherently wrong with the cluster such as writes being
> dropped or time differences between the nodes.
>
> A brute-force approach to better understand what's going on (especially if
> you have an example of the wrong data being returned) is to do a
> sstable2json on all your tables and simply grep for an example key.
>
> On Mon, Mar 30, 2015 at 4:39 PM, Benyi Wang  wrote:
>
>> Thanks for replying.
>>
>> In cqlsh, if I change to Quorum (Consistency quorum), sometime the select
>> return the deleted row, sometime not.
>>
>> I have two virtual data centers: service (3 nodes) and analytics(4 nodes
>> collocate with Hadoop data nodes).The table has 3 replicas in service and 2
>> in analytics. When I wrote, I wrote into analytics using local_one. So I
>> guest the data may not replicated to all nodes yet.
>>
>> I will try to use strong consistency for write.
>>
>>
>>
>> On Mon, Mar 30, 2015 at 11:59 AM, Prem Yadav 
>> wrote:
>>
>>> Increase the read CL to quorum and you should get correct results.
>>> How many nodes do you have in the cluster and what is the replication
>>> factor for the keyspace?
>>>
>>> On Mon, Mar 30, 2015 at 7:41 PM, Benyi Wang 
>>> wrote:
>>>
>>>> Create table tomb_test (
>>>>guid text,
>>>>content text,
>>>>range text,
>>>>rank int,
>>>>id text,
>>>>cnt int
>>>>primary key (guid, content, range, rank)
>>>> )
>>>>
>>>> Sometime I delete the rows using cassandra java driver using this query
>>>>
>>>> DELETE FROM tomb_test WHERE guid=? and content=? and range=?
>>>>
>>>> in Batch statement with UNLOGGED. CONSISTENCE_LEVEL is local_one.
>>>>
>>>> But if I run
>>>>
>>>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
>>>> range='week'
>>>> or
>>>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
>>>> range='week' and rank = 1
>>>>
>>>> The result shows the deleted rows.
>>>>
>>>> If I run this select, the deleted rows are not shown
>>>>
>>>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1'
>>>>
>>>> If I run delete statement in cqlsh, the deleted rows won't show up.
>>>>
>>>> How can I fix this?
>>>>
>>>>
>>>
>>
>
>
> --
> *Ken Hancock *| System Architect, Advanced Advertising
> SeaChange International
> 50 Nagog Park
> Acton, Massachusetts 01720
> ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
> <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
>


Re: Why select returns tombstoned results?

2015-04-01 Thread Benyi Wang
Unfortunately I'm using 2.1.2. Is it possible to downgrade to 2.0.13
without wiping out the data? I'm worried there is a bug in 2.1.2.


On Tue, Mar 31, 2015 at 4:37 AM, Paulo Ricardo Motta Gomes <
paulo.mo...@chaordicsystems.com> wrote:

>  What version of Cassandra are you running? Are you by any chance running
> repairs on your data?
>
> On Mon, Mar 30, 2015 at 5:39 PM, Benyi Wang  wrote:
>
>> Thanks for replying.
>>
>> In cqlsh, if I change to Quorum (Consistency quorum), sometime the select
>> return the deleted row, sometime not.
>>
>> I have two virtual data centers: service (3 nodes) and analytics(4 nodes
>> collocate with Hadoop data nodes).The table has 3 replicas in service and 2
>> in analytics. When I wrote, I wrote into analytics using local_one. So I
>> guest the data may not replicated to all nodes yet.
>>
>> I will try to use strong consistency for write.
>>
>>
>>
>> On Mon, Mar 30, 2015 at 11:59 AM, Prem Yadav 
>> wrote:
>>
>>> Increase the read CL to quorum and you should get correct results.
>>> How many nodes do you have in the cluster and what is the replication
>>> factor for the keyspace?
>>>
>>> On Mon, Mar 30, 2015 at 7:41 PM, Benyi Wang 
>>> wrote:
>>>
>>>> Create table tomb_test (
>>>>guid text,
>>>>content text,
>>>>range text,
>>>>rank int,
>>>>id text,
>>>>cnt int
>>>>primary key (guid, content, range, rank)
>>>> )
>>>>
>>>> Sometime I delete the rows using cassandra java driver using this query
>>>>
>>>> DELETE FROM tomb_test WHERE guid=? and content=? and range=?
>>>>
>>>> in Batch statement with UNLOGGED. CONSISTENCE_LEVEL is local_one.
>>>>
>>>> But if I run
>>>>
>>>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
>>>> range='week'
>>>> or
>>>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
>>>> range='week' and rank = 1
>>>>
>>>> The result shows the deleted rows.
>>>>
>>>> If I run this select, the deleted rows are not shown
>>>>
>>>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1'
>>>>
>>>> If I run delete statement in cqlsh, the deleted rows won't show up.
>>>>
>>>> How can I fix this?
>>>>
>>>>
>>>
>>
>
>
> --
> *Paulo Motta*
>
> Chaordic | *Platform*
> *www.chaordic.com.br <http://www.chaordic.com.br/>*
> +55 48 3232.3200
>


Re: Why select returns tombstoned results?

2015-03-30 Thread Benyi Wang
Thanks for replying.

In cqlsh, if I change to QUORUM (CONSISTENCY QUORUM), sometimes the select
returns the deleted row and sometimes not.

I have two virtual data centers: service (3 nodes) and analytics (4 nodes
co-located with Hadoop data nodes). The table has 3 replicas in service and
2 in analytics. When I wrote, I wrote into analytics using LOCAL_ONE, so I
guess the data may not have replicated to all nodes yet.

I will try to use strong consistency for writes.
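
For reference, a minimal sketch of issuing the same deletes at LOCAL_QUORUM
with the Java driver, using the tomb_test table from the original post (the
contact point and keyspace name are placeholders). A LOCAL_QUORUM write plus
a LOCAL_QUORUM read overlap on at least one replica, so the read can no
longer miss the tombstone the way LOCAL_ONE can:

import com.datastax.driver.core.*;

public class QuorumDelete {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
        Session session = cluster.connect("my_ks");

        PreparedStatement del = session.prepare(
                "DELETE FROM tomb_test WHERE guid = ? AND content = ? AND range = ?");

        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        // Applied to every statement in the batch.
        batch.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
        batch.add(del.bind("guid-1", "content-1", "week"));
        session.execute(batch);

        cluster.close();
    }
}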



On Mon, Mar 30, 2015 at 11:59 AM, Prem Yadav  wrote:

> Increase the read CL to quorum and you should get correct results.
> How many nodes do you have in the cluster and what is the replication
> factor for the keyspace?
>
> On Mon, Mar 30, 2015 at 7:41 PM, Benyi Wang  wrote:
>
>> Create table tomb_test (
>>guid text,
>>content text,
>>range text,
>>rank int,
>>id text,
>>cnt int
>>primary key (guid, content, range, rank)
>> )
>>
>> Sometime I delete the rows using cassandra java driver using this query
>>
>> DELETE FROM tomb_test WHERE guid=? and content=? and range=?
>>
>> in Batch statement with UNLOGGED. CONSISTENCE_LEVEL is local_one.
>>
>> But if I run
>>
>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
>> range='week'
>> or
>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
>> range='week' and rank = 1
>>
>> The result shows the deleted rows.
>>
>> If I run this select, the deleted rows are not shown
>>
>> SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1'
>>
>> If I run delete statement in cqlsh, the deleted rows won't show up.
>>
>> How can I fix this?
>>
>>
>


Why select returns tombstoned results?

2015-03-30 Thread Benyi Wang
Create table tomb_test (
   guid text,
   content text,
   range text,
   rank int,
   id text,
   cnt int
   primary key (guid, content, range, rank)
)

Sometimes I delete rows with the Cassandra Java driver using this query:

DELETE FROM tomb_test WHERE guid=? and content=? and range=?

in an UNLOGGED batch statement. The consistency level is LOCAL_ONE.

But if I run

SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
range='week'
or
SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1' and
range='week' and rank = 1

The result shows the deleted rows.

If I run this select, the deleted rows are not shown

SELECT * FROM tomb_test WHERE guid='guid-1' and content='content-1'

If I run the delete statement in cqlsh, the deleted rows won't show up.

How can I fix this?


Delete columns

2015-02-27 Thread Benyi Wang
In C* 2.1.2, is there a way you can delete without specifying the row key?

create table a_table (
  guid text,
  key1 text,
  key2 text,
  data int,
  primary key (guid, key1, key2)
);

delete from a_table where key1='' and key2='';

I'm trying to avoid doing it like this:
* query the table to get the guids (32 bytes long)
* send back delete queries like this:

delete from a_table where guid in (...) and key1='' and key2=''.

key1 and key2 only have 3~4 values each. If I create multiple tables like
table_kvi_kvj, deleting becomes easy, but it results in a much larger
dataset because of the duplicated guids.

Because the CQL model will create a Cassandra column family laid out like

guid, kv1-kv2, ..., kvi-kvj, ..., kvn-kvm, ...

is there an API that can drop columns in a column family?
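
For reference, as far as I know CQL deletes always need the full partition
key, so the two-step pattern above seems to be the available route; a
minimal sketch with the Java driver that issues the per-guid deletes
asynchronously instead of building one large IN list (the table and column
names follow the example above, everything else is a placeholder):

import com.datastax.driver.core.*;
import java.util.ArrayList;
import java.util.List;

public class DeleteByClustering {
    static void deleteWhere(Session session, String key1, String key2) {
        PreparedStatement delete = session.prepare(
                "DELETE FROM a_table WHERE guid = ? AND key1 = ? AND key2 = ?");

        List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
        // Step 1: collect the partition keys ("query the table to get guids").
        for (Row row : session.execute("SELECT DISTINCT guid FROM a_table")) {
            // Step 2: one delete per partition, issued asynchronously.
            futures.add(session.executeAsync(
                    delete.bind(row.getString("guid"), key1, key2)));
        }
        for (ResultSetFuture f : futures) f.getUninterruptibly();
    }
}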


Re: How to bulkload into a specific data center?

2015-01-10 Thread Benyi Wang
On Fri, Jan 9, 2015 at 3:55 PM, Robert Coli  wrote:

> On Fri, Jan 9, 2015 at 11:38 AM, Benyi Wang  wrote:
>
>>
>>- Is it possible to modify SSTableLoader to allow it access one data
>>center?
>>
>> Even if you only write to nodes in DC A, if you replicate that data to DC
> B, it will have to travel over the WAN anyway? What are you trying to avoid?
>
>

I'm lucky that these are virtual data centers on the same LAN.

I just don't want a load burst in the "service" virtual data center, because
it may degrade the REST service. I'm trying to load data into the
"analytics" virtual data center and then let Cassandra "slowly" replicate
the data into the "service" virtual data center. It is OK for the REST
service to read some stale data while the replication is happening.

I'm wondering if I should just use the throttle speed (in Mbit/s) option to
solve my problem?

>> Because I may load ~100 million, I think spark-cassandra-connector might
>> be too slow. I'm wondering if the "copy the sstables / nodetool refresh"
>> method described in
>> http://www.pythian.com/blog/bulk-loading-options-for-cassandra/ will be a
>> good choice. I'm still a newbie to Cassandra. I could not understand what
>> the author said on that page.
>>
>
> The author of that post is as wise as he is modest... ;D
>
>
>> One of my question is:
>>
>> * When I run a spark job in yarn mode, the sstables are created into YARN
>> working directory.
>> * Assume I have a way to copy the files into the Cassandra directory on
>> the same node.
>> * Because the data are distributed across all analytics data center's
>> nodes, each one has only a part of sstables, node A has part A, node B has
>> part B. If I run refresh on each node, eventually node A has part A,B, and
>> node B will have part A,B too. Am I right?
>>
>
> I'm not sure I fully understand your question, but...
>
> In order to run refresh without having to immediately run cleanup, you
> need to have SSTables which contain data only for ranges owned by the node
> you are loading them onto.
>
> So for a RF=3, N=3 cluster without vnodes (simple case), data is naturally
> on every node.
>
> For RF=3, N=6 cluster A B C D E F, node C contains :
>
> - Third replica for A.
> - Second replica for B.
> - First replica for C.
>
> In order for you to generate the correct SSTable, you need to understand
> all 3 replicas that should be there. With vnodes and nodes joining and
> parting, this becomes more difficult.
>
> That's why people tend to use SSTableloader and the streaming interface :
> with SSTableloader, Cassandra takes input which might live on any replica
> and sends it to the appropriate nodes.
>
> =Rob
> http://twitter.com/rcolidba
>

I'd better stay with SSTableLoader. Thanks for your explanation.


Re: How to bulkload into a specific data center?

2015-01-09 Thread Benyi Wang
Hi Ryan,

Thanks for your reply. Now I understand how SSTableLoader works.

   - If I understand correctly, the current o.a.c.io.sstable.SSTableLoader
   doesn't use LOCAL_ONE or LOCAL_QUORUM. Is that right?
   - Is it possible to modify SSTableLoader to allow it to access only one
   data center?

Because I may load ~100 million, I think the spark-cassandra-connector might
be too slow. I'm wondering if the "copy the sstables / nodetool refresh"
method described in
http://www.pythian.com/blog/bulk-loading-options-for-cassandra/ will be a
good choice. I'm still a newbie to Cassandra and could not fully understand
what the author said on that page. One of my questions is:

* When I run a Spark job in YARN mode, the sstables are created in the YARN
working directory.
* Assume I have a way to copy the files into the Cassandra data directory on
the same node.
* Because the data is distributed across all of the analytics data center's
nodes, each one has only a part of the sstables: node A has part A, node B
has part B. If I run refresh on each node, will node A eventually have parts
A and B, and node B have parts A and B too? Am I right?

Thanks.
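
For the CQL-driver route Ryan mentions below, the job is pinned to one data
center on the client side with a DC-aware load-balancing policy plus a
LOCAL_* consistency level; a minimal sketch, where the DC name, keyspace,
table, and contact point are placeholders:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class AnalyticsDcClient {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1")   // a node in the analytics DC
                .withLoadBalancingPolicy(
                        new TokenAwarePolicy(new DCAwareRoundRobinPolicy("analytics")))
                .withQueryOptions(new QueryOptions()
                        .setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
                .build();
        Session session = cluster.connect("my_ks");

        // Writes coordinate and acknowledge in the analytics DC only;
        // replication to the service DC still happens in the background.
        session.execute("INSERT INTO my_table (k, v) VALUES ('a', 1)");
        cluster.close();
    }
}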

On Thu, Jan 8, 2015 at 6:34 AM, Ryan Svihla  wrote:

> Just noticed you'd sent this to the dev list, this is a question for only
> the user list, and please do not send questions of this type to the
> developer list.
>
> On Thu, Jan 8, 2015 at 8:33 AM, Ryan Svihla  wrote:
>
> > The nature of replication factor is such that writes will go wherever
> > there is replication. If you're wanting responses to be faster, and not
> > involve the REST data center in the spark job for response I suggest
> using
> > a cql driver and LOCAL_ONE or LOCAL_QUORUM consistency level (look at the
> > spark cassandra connector here
> > https://github.com/datastax/spark-cassandra-connector ) . While write
> > traffic will still be replicated to the REST service data center, because
> > you do want those results available, you will not be waiting on the
> remote
> > data center to respond "successful".
> >
> > Final point, bulk loading sends a copy per replica across the wire, so
> > lets say you have RF3 in each data center that means bulk loading will
> send
> > out 6 copies from that client at once, with normal mutations via thrift
> or
> > cql writes between data centers go out as 1 copy, then that node will
> > forward on to the other replicas. This means intra data center traffic in
> > this case would be 3x more with the bulk loader than with using a
> > traditional cql or thrift based client.
> >
> >
> >
> > On Wed, Jan 7, 2015 at 6:32 PM, Benyi Wang 
> wrote:
> >
> >> I set up two virtual data centers, one for analytics and one for REST
> >> service. The analytics data center sits top on Hadoop cluster. I want to
> >> bulk load my ETL results into the analytics data center so that the REST
> >> service won't have the heavy load. I'm using CQLTableInputFormat in my
> >> Spark Application, and I gave the nodes in analytics data center as
> >> Intialial address.
> >>
> >> However, I found my jobs were connecting to the REST service data
> center.
> >>
> >> How can I specify the data center?
> >>
> >
> >
> >
> > --
> >
> > Thanks,
> > Ryan Svihla
> >
> >
>
>
> --
>
> Thanks,
> Ryan Svihla
>


How to bulkload into a specific data center?

2015-01-07 Thread Benyi Wang
I set up two virtual data centers, one for analytics and one for the REST
service. The analytics data center sits on top of the Hadoop cluster. I want
to bulk load my ETL results into the analytics data center so that the REST
service won't take the heavy load. I'm using CQLTableInputFormat in my Spark
application, and I gave the nodes in the analytics data center as the
initial addresses.

However, I found my jobs were connecting to the REST service data center.

How can I specify the data center?


Is it possible to delete columns or row using CQLSSTableWriter?

2015-01-07 Thread Benyi Wang
CQLSSTableWriter only accepts an INSERT or UPDATE statement. I'm wondering
whether it could be made to accept DELETE statements.

I need to update my Cassandra table with a lot of data every day.

* I may need to delete a row (given the partition key).
* I may need to delete some columns. For example, if there are 20 rows for a
partition key before loading, the new load may have only 10 rows.

Because CQLSSTableWriter writes into fresh sstables, would a DELETE put a
tombstone in them so that the existing row on the server is deleted after
bulk loading?

Thanks.
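
For reference, a minimal sketch of what CQLSSTableWriter supports today
(INSERT only); the keyspace, table, and output directory are placeholders,
and whether a DELETE variant would carry its tombstone through the bulk load
is exactly the open question:

import java.io.File;
import org.apache.cassandra.dht.Murmur3Partitioner;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class BulkWrite {
    public static void main(String[] args) throws Exception {
        String schema = "CREATE TABLE my_ks.my_table (k text PRIMARY KEY, v int)";
        String insert = "INSERT INTO my_ks.my_table (k, v) VALUES (?, ?)";

        // The output directory must already exist.
        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                .inDirectory(new File("/tmp/my_ks/my_table"))
                .forTable(schema)
                .using(insert)
                .withPartitioner(new Murmur3Partitioner())  // match the cluster
                .build();

        writer.addRow("row-1", 42);   // one call per row; INSERT/UPDATE semantics only
        writer.close();

        // The generated sstables are then streamed in with sstableloader, or
        // copied into the data directory and picked up by `nodetool refresh`.
    }
}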


Is it possible to flush memtable in one virtual center?

2014-12-15 Thread Benyi Wang
We have one ring and two virtual data centers in our Cassandra cluster: one
is for real-time traffic and the other is for analytics. My questions are:

   1. Are there memtables in the analytics data center? To my understanding,
   there are.
   2. Is it possible to flush memtables, if they exist, in the analytics
   data center only?

I'm using Cassandra 1.0.7 for this cluster.

Thanks.


What happens at server side when using SSTableLoader

2014-12-01 Thread Benyi Wang
Is there a page explaining what happens on the server side when using
SSTableLoader?

I'm looking for answers to the following questions:

   1. What happens to the existing data in the table? From my test, the data
   in the sstable files is applied on top of the existing data. Am I right?
  - New rows or columns in the sstables are created.
  - Existing columns are updated.
  - Deleted rows/columns are also applied.
   2. What is the impact on read operations while bulk loading data?

Thanks.