I need to batch load a lot of data every day into a keyspace that spans two
DCs, one on the West Coast and the other on the East Coast.
I assume that the network delay between the two sites will cause a lot of
dropped mutation messages if I write too fast in the local DC using
LOCAL_QUORUM.
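A note on the arithmetic (my own sketch, not from the thread): LOCAL_QUORUM
counts acks only in the coordinator's DC, so the cross-DC link should not add
latency to the write itself, only to the asynchronous replication to the
remote DC. Assuming NetworkTopologyStrategy with hypothetical per-DC RFs:

```python
# Quorum math for LOCAL_QUORUM: acks are counted only in the local DC.
# Sketch assuming NetworkTopologyStrategy; the DC names and RFs below are
# hypothetical, not from the thread.

def quorum(rf: int) -> int:
    """Replicas that must ack a quorum operation: floor(rf / 2) + 1."""
    return rf // 2 + 1

def local_quorum_acks(rf_by_dc: dict, local_dc: str) -> int:
    """LOCAL_QUORUM waits only on the local DC's quorum; remote DCs
    receive the mutation asynchronously."""
    return quorum(rf_by_dc[local_dc])

rf = {"west": 3, "east": 3}
print(local_quorum_acks(rf, "west"))  # 2 acks, all on the West Coast
```

Writing too fast can still drop mutations on the remote replicas, but those
drops surface as hints/repair work, not as failed LOCAL_QUORUM writes.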
). So it should be safe to do.
>
> When the repair has run you can start with the plan I suggested and run
> repairs afterwards.
>
> Hannu
>
> On 8 Sep 2016, at 18:01, Benyi Wang <bewang.t...@gmail.com> wrote:
>
> Thanks. What about this situation:
>
> * Cha
<hkro...@gmail.com> wrote:
> Yep, you can fix it by running repair or even faster by changing the
> consistency level to local_quorum and deploying the new version of the app.
>
> Hannu
>
> On 8 Sep 2016, at 17:51, Benyi Wang <bewang.t...@gmail.com> wrote:
>
> Thanks, Hannu
> QUORUM is
> effectively ALL in your case (RF=2) if you have just one DC, so you can
> change the batch CL later.
>
> Cheers,
> Hannu
>
> > On 8 Sep 2016, at 14:42, Benyi Wang <bewang.t...@gmail.com> wrote:
> >
> > * I have a keyspace with RF=2;
> >
* I have a keyspace with RF=2;
* The client reads the table using LOCAL_ONE;
* There is a batch job loading data into the tables using ALL.
I want to change RF to 3 and have both the client and the batch job use
LOCAL_QUORUM.
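One way to sanity-check this plan (my own sketch, not from the thread): reads
stay consistent whenever the read and write replica sets must overlap, i.e.
R + W > RF. With RF=3 and LOCAL_QUORUM on both paths, 2 + 2 > 3:

```python
# Overlap check for tunable consistency: a read sees the latest write
# when read acks + write acks exceed the replication factor.

def quorum(rf: int) -> int:
    return rf // 2 + 1

def strongly_consistent(read_acks: int, write_acks: int, rf: int) -> bool:
    """True when every read quorum intersects every write quorum."""
    return read_acks + write_acks > rf

rf = 3
r = w = quorum(rf)                      # LOCAL_QUORUM for client and batch
print(strongly_consistent(r, w, rf))    # True: 2 + 2 > 3
# The old setup also overlapped: LOCAL_ONE read (1) + ALL write (2) > RF=2.
print(strongly_consistent(1, 2, 2))     # True
```

The overlap only holds once repair has populated the third replica, which is
why the repair-first ordering in the thread matters.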
My question is "Will the client still read the correct data when the repair
Never mind. I found the root cause. This has nothing to do with Cassandra
and repair. Some web services called by the client caused the problem.
On Fri, Aug 19, 2016 at 11:53 AM, Benyi Wang <bewang.t...@gmail.com> wrote:
> I'm using the Cassandra Java driver to access a small Cassandra cluster
I'm using the Cassandra Java driver to access a small Cassandra cluster
* The cluster has 3 nodes in DC1 and 3 nodes in DC2
* The keyspace was originally created in DC1 only, with RF=2
* The client had a good 99th-percentile read latency of about 40 ms under 100
requests/sec (measured on the client side)
g it inside
> something like foreachPartition).
>
> Also you can easily tune up and down the size of those tasks and therefore
> batches to minimize harm on the prod system.
>
> On Sep 24, 2015, at 5:37 PM, Benyi Wang <bewang.t...@gmail.com> wrote:
>
> I use Spark
I have a Cassandra cluster that provides data to a web service, and there is
a daily batch load writing data into the cluster.
- Without the batch load, the service's 99th-percentile latency is
3 ms, but during the load it jumps to 90 ms.
- I checked the Cassandra keyspace's
- Write to Cassandra.
I know using CQLBulkOutputFormat would be better, but it doesn't support
DELETE.
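The thread doesn't say how the load is paced; one common mitigation for the
latency spike described above is to rate-limit the batch writer. A minimal
token-bucket sketch (the class, rates, and names are my own, not from the
thread):

```python
import time

class TokenBucket:
    """Simple rate limiter: at most `rate` ops/sec, bursts up to `burst`."""

    def __init__(self, rate: float, burst: float):
        self.rate, self.capacity = rate, burst
        self.tokens, self.last = burst, time.monotonic()

    def acquire(self, n: float = 1.0) -> None:
        """Block until `n` tokens are available, then consume them."""
        while True:
            now = time.monotonic()
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last = now
            if self.tokens >= n:
                self.tokens -= n
                return
            time.sleep((n - self.tokens) / self.rate)

bucket = TokenBucket(rate=5000, burst=500)  # cap the loader at ~5k writes/sec
bucket.acquire()                            # call before each write/batch
```

Tuning `rate` down until the serving latency recovers is cruder than
server-side throttling, but it needs no cluster changes.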
On Thu, Sep 24, 2015 at 1:27 PM, Gerard Maas <gerard.m...@gmail.com> wrote:
> How are you loading the data? I mean, what insert method are you using?
>
> On Thu, Sep 24, 2015 at 9:58
I have a small cluster with 3 nodes and installed Cassandra 2.1.2 from the
DataStax YUM repository. I know 2.1.2 is not recommended for production.
The problem I observed is:
- When I use vnodes with num_tokens=256, the read latency is about 20 ms
at the 50th percentile.
- If I disable vnodes, the
It didn't work. I ran the command on all nodes, but I can still see the
repair activities.
On Wed, Apr 15, 2015 at 3:20 PM, Sebastian Estevez
sebastian.este...@datastax.com wrote:
nodetool stop VALIDATION
On Apr 15, 2015 5:16 PM, Benyi Wang bewang.t...@gmail.com wrote:
I ran nodetool repair -- keyspace table for a table, and it is still
running after 4 days. I know there is an issue with repair and vnodes
(https://issues.apache.org/jira/browse/CASSANDRA-5220). My question is: how
can I kill this sequential repair?
I killed the process with which I ran the repair
Using JMX worked. Thanks a lot.
On Wed, Apr 15, 2015 at 3:57 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Apr 15, 2015 at 3:30 PM, Benyi Wang bewang.t...@gmail.com wrote:
It didn't work. I ran the command on all nodes, but I can still see the
repair activities.
Your input
It sounds like -pr runs on only one node?
I still don't understand: what is the first range returned by the partitioner
for a node?
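For what it's worth, the "primary range" is the slice of the token ring a
node owns directly: everything between the previous node's token and its own.
A toy ring (my own sketch with hypothetical tokens) shows why -pr must run on
every node: the primary ranges tile the ring exactly once, so one node's -pr
repair covers only its own slice, but all nodes together cover everything
with no slice repaired twice.

```python
# Toy token ring illustrating `nodetool repair -pr` (primary range only).
# Node names and token values are hypothetical.

tokens = {"node1": 0, "node2": 100, "node3": 200}
ring = sorted(tokens.items(), key=lambda kv: kv[1])  # order around the ring

def primary_range(i: int) -> tuple:
    """A node's primary range: (previous node's token, its own token],
    wrapping around the ring for the first node."""
    prev = ring[i - 1][1]
    return (prev, ring[i][1])

ranges = [primary_range(i) for i in range(len(ring))]
print(ranges)  # [(200, 0), (0, 100), (100, 200)] -- tiles the ring once
```

Without -pr, a repair on one node touches every range that node replicates,
so running it on all nodes repairs each range RF times over.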
On Mon, Apr 13, 2015 at 1:40 PM, Robert Coli rc...@eventbrite.com wrote:
On Mon, Apr 13, 2015 at 1:36 PM, Benyi Wang bewang.t...@gmail.com wrote:
- I need to run compaction
I have read the documentation several times, but I'm still not quite sure how
to run repair and compaction.
To my understanding:
- I need to run compaction on each node;
- To repair a table (column family), I only need to run repair on any one of
the nodes.
Am I right?
Thanks.
Are you by any chance running
repairs on your data?
On Mon, Mar 30, 2015 at 5:39 PM, Benyi Wang bewang.t...@gmail.com wrote:
Thanks for replying.
In cqlsh, if I change to QUORUM (CONSISTENCY QUORUM), sometimes the SELECT
returns the deleted row and sometimes it doesn't.
I have two virtual data centers
going on (especially if
you have an example of the wrong data being returned) is to do a
sstable2json on all your tables and simply grep for an example key.
On Mon, Mar 30, 2015 at 4:39 PM, Benyi Wang bewang.t...@gmail.com wrote:
Thanks for replying.
In cqlsh, if I change to Quorum
Create table tomb_test (
guid text,
content text,
range text,
rank int,
id text,
cnt int,
primary key (guid, content, range, rank)
)
Sometimes I delete the rows with the Cassandra Java driver using this query
DELETE FROM tomb_test WHERE guid=? and content=? and range=?
in a batch.
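The reappearing rows above are consistent with a replica missing the
tombstone. A deletion in Cassandra is a timestamped marker, and at read time
the replicas' answers are reconciled by timestamp; if no contacted replica
has the tombstone, the stale row wins. A toy model of that reconciliation
(my own sketch, not Cassandra's actual read path):

```python
# Toy last-write-wins reconciliation. A DELETE is just a "tombstone" cell
# with a timestamp; value None marks a tombstone below.

def reconcile(replica_answers):
    """Each answer is (write_timestamp, value); highest timestamp wins."""
    ts, value = max(replica_answers, key=lambda a: a[0])
    return value  # None => the row is (correctly) seen as deleted

# Replica A saw the DELETE (ts=20); replica B still has the old row (ts=10).
print(reconcile([(20, None), (10, "old row")]))       # None -- delete wins
# If only stale replicas answer the read, the deleted row comes back:
print(reconcile([(10, "old row"), (10, "old row")]))  # 'old row'
```

This is why the earlier advice in these threads is to repair (propagating
tombstones to all replicas) and/or raise the consistency level so a
tombstone-holding replica is always in the read set.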
results.
How many nodes do you have in the cluster and what is the replication
factor for the keyspace?
On Mon, Mar 30, 2015 at 7:41 PM, Benyi Wang bewang.t...@gmail.com wrote:
Create table tomb_test (
guid text,
content text,
range text,
rank int,
id text,
cnt int
In C* 2.1.2, is there a way you can delete without specifying the row key?
create table a_table (
guid text,
key1 text,
key2 text,
data int,
primary key (guid, key1, key2)
);
delete from a_table where key1='kv1' and key2='kv2';
I'm trying to avoid doing like this:
* query the table to get
On Fri, Jan 9, 2015 at 3:55 PM, Robert Coli rc...@eventbrite.com wrote:
On Fri, Jan 9, 2015 at 11:38 AM, Benyi Wang bewang.t...@gmail.com wrote:
- Is it possible to modify SSTableLoader to allow it to access only one data
center?
Even if you only write to nodes in DC A, if you replicate
in
this case would be 3x more with the bulk loader than with a
traditional CQL or Thrift based client.
On Wed, Jan 7, 2015 at 6:32 PM, Benyi Wang bewang.t...@gmail.com
wrote:
I set up two virtual data centers, one for analytics and one for REST
service. The analytics data center sits on top of the Hadoop cluster. I want
to bulk load my ETL results into the analytics data center so that the REST
service won't take the heavy load. I'm using CQLTableInputFormat in my
Spark
CQLSSTableWriter only accepts an INSERT or UPDATE statement. I'm wondering
whether it could be made to accept DELETE statements.
I need to update my Cassandra table with a lot of data every day.
* I may need to delete a row (given the partition key)
* I may need to delete some columns. For example, there are
We have one ring and two virtual data centers in our Cassandra cluster: one
is for real-time and the other is for analytics. My questions are:
1. Are there memtables in the analytics data center? To my understanding,
there are.
2. Is it possible to flush the memtables, if they exist, in the analytics data
Is there a page explaining what happens on the server side when using
SSTableLoader?
I'm seeking answers to the following questions:
1. What about the existing data in the table? From my test, the data in the
streamed sstable files is applied on top of the existing data. Am I right?
- The new
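On question 1 above: streamed sstables are not applied in place. They land
next to the existing sstables, and reads and compaction later merge cells
per (key, column) by write timestamp, last write wins. A toy model of that
merge (my own sketch; real sstables also carry tombstones and TTLs):

```python
# Toy model of how a streamed SSTable "applies" to existing data: cells
# are keyed by (partition_key, column) and merged by highest timestamp.
# The keys and values below are hypothetical.

def merge(existing: dict, streamed: dict) -> dict:
    """Merge two cell maps of {key: (timestamp, value)}; newest wins."""
    merged = dict(existing)
    for key, (ts, val) in streamed.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, val)
    return merged

existing = {("k1", "c1"): (100, "old"), ("k2", "c1"): (100, "keep")}
streamed = {("k1", "c1"): (200, "new")}   # newer timestamp shadows "old"
print(merge(existing, streamed))
# {('k1', 'c1'): (200, 'new'), ('k2', 'c1'): (100, 'keep')}
```

So the loader can only ever add or shadow data; rows absent from the
streamed files are left untouched.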