Re: How to minimize side effects induced by tombstones when using deletion?

2017-08-01 Thread Jing Meng
be collected more frequently -- Jeff Jirsa On Jul 31, 2017, at 11:02 PM, Jing Meng <self.rel...@gmail.com> wrote: Hi there. We have a keyspace containing tons of records, and deletions are used as enforced by its business lo

How to minimize side effects induced by tombstones when using deletion?

2017-08-01 Thread Jing Meng
Hi there. We have a keyspace containing tons of records, and deletions are required by its business logic. As the data accumulates, we are suffering a performance penalty due to tombstones, and we are still confused about what could be done to minimize the harm, or whether we should avoid any deletions

Re: Suggestions for migrating data from cassandra

2018-05-16 Thread Jing Meng
SQL. Could you use other paths such as: StreamSets, Talend Open Studio, Kafka Streams. 2018-05-15 4:59 GMT-06:00 Jing Meng <self.rel...@gmail.com>: Hi guys, for some historical reason, our cassandra clust

Suggestions for migrating data from cassandra

2018-05-15 Thread Jing Meng
Hi guys, for some historical reason, our cassandra cluster is currently overloaded, and operating it has somehow become a nightmare. Anyway, (sadly) we're planning to migrate cassandra data back to mysql... So we're not quite clear how to migrate the historical data from cassandra. While as I

Question upon gracefully restarting c* node(s)

2018-01-01 Thread Jing Meng
Hi all. Recently we made a change to our production env c* cluster (2.1.18) - placing the commit log on the same SSD where data is stored, which requires restarting all nodes. Before restarting a cassandra node, we ran the following nodetool commands: $ nodetool disablethrift && sleep 5 $ nodetool

Wondering how cql3 DISTINCT query is implemented

2018-10-22 Thread Jing Meng
Hi, we built a simple system to migrate live cassandra data to other databases, mainly by using these queries: 1. SELECT DISTINCT TOKEN(partition_key) FROM table WHERE TOKEN(partition_key) > current_offset AND TOKEN(partition_key) <= upper_bound LIMIT token_fetch_size 2. Any cql query that
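
The paging pattern in query 1 above can be sketched roughly as follows. This is only an illustration under assumptions, not the poster's actual system: `build_token_page_query`, `iter_tokens`, and the `fetch_page` callback are hypothetical names, the default bounds assume the Murmur3Partitioner token range, and the loop assumes each page comes back in ascending token order so the last token can serve as the next `current_offset`.

```python
from typing import Callable, Iterator, List

# Assumption: Murmur3Partitioner, whose tokens are signed 64-bit integers.
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def build_token_page_query(table: str, partition_key: str,
                           current_offset: int, upper_bound: int,
                           token_fetch_size: int) -> str:
    """Build the text of query 1 for one page (hypothetical helper).

    The table and partition key names are interpolated as-is, so they must
    come from trusted configuration, never from user input.
    """
    return (
        f"SELECT DISTINCT TOKEN({partition_key}) FROM {table} "
        f"WHERE TOKEN({partition_key}) > {int(current_offset)} "
        f"AND TOKEN({partition_key}) <= {int(upper_bound)} "
        f"LIMIT {int(token_fetch_size)}"
    )


def iter_tokens(fetch_page: Callable[[int, int, int], List[int]],
                lower: int = MIN_TOKEN, upper: int = MAX_TOKEN,
                token_fetch_size: int = 1000) -> Iterator[int]:
    """Yield every token in (lower, upper], one page at a time.

    fetch_page(current_offset, upper_bound, token_fetch_size) is a
    hypothetical callback that runs query 1 and returns the tokens of that
    page, assumed to be sorted in ascending order.
    """
    current_offset = lower
    while True:
        page = fetch_page(current_offset, upper, token_fetch_size)
        if not page:
            return  # no tokens left in the range
        yield from page
        # Resume strictly after the last token seen on this page.
        current_offset = page[-1]
```

In a real migration, `fetch_page` would send the string from `build_token_page_query` through a Cassandra driver session and unpack the returned rows; that wiring is left out here because it depends on the driver and schema in use.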