Reply: A difficult data model with C*

2016-11-08 Thread ben ben
Hi Vladimir Yudovin, Thank you very much for your detailed explanation. Maybe I didn't describe the requirement clearly. The use cases are: 1. a user logs in to our app. 2. show the ten most recent movies watched by the user within 30 days. 3. the user can click any one of the ten movies and

Re: failure node rejoin

2016-11-08 Thread Ben Slater
There have been a few commit log bugs around in the last couple of months so perhaps you’ve hit something that was fixed recently. It would be interesting to know if the problem is still occurring in 2.2.8. I suspect what is happening is that when you do your initial read (without flush) to check the

Re: failure node rejoin

2016-11-08 Thread Yuji Ito
I tried C* 3.0.9 instead of 2.2. The data loss problem hasn't happened so far (without `nodetool flush`). Thanks On Fri, Nov 4, 2016 at 3:50 PM, Yuji Ito wrote: > Thanks Ben, > > When I added `nodetool flush` on all nodes after step 2, the problem > didn't happen. > Did

Re: How to confirm TWCS is fully in-place

2016-11-08 Thread Oskar Kjellin
Hi, You could manually trigger it with nodetool compact. /Oskar > On 8 nov. 2016, at 21:47, Lahiru Gamathige wrote: > > Hi Users, > > I am thinking of migrating our timeseries tables to use TWCS. I am using JMX > to set the new compaction and one node at a time and I

How to confirm TWCS is fully in-place

2016-11-08 Thread Lahiru Gamathige
Hi Users, I am thinking of migrating our timeseries tables to use TWCS. I am using JMX to set the new compaction and one node at a time and I am not sure how to confirm that after the flush all the compaction is done in each node. I tried this in a small cluster but after setting the compaction I
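Changing the compaction strategy via JMX, as described above, applies to one node at a time and does not persist in the schema. A hedged sketch of making the change cluster-wide instead, assuming a hypothetical keyspace/table `metrics.ts_data` and illustrative window settings:

```
-- Switch the table to TWCS for the whole cluster; the change is
-- persisted in the schema and picked up by every node.
ALTER TABLE metrics.ts_data
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
  };
```

After the change, existing SSTables are rewritten only as they are compacted, which is why `nodetool compact` (as suggested in the reply above) can be used to force the migration to completion.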

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Alain Rastoul
On 11/08/2016 08:52 PM, Alain Rastoul wrote: For example if you had to track the position of a lot of objects, instead of updating the object records, each second you could insert a new event with : (object: object_id, event_type: position_move, position : x, y ). and add a timestamp of
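The append-only approach described in the preview could be modeled with a time-based clustering column, so every position change is a new insert rather than an update. A minimal sketch, with table and column names chosen purely for illustration:

```
CREATE TABLE tracking.object_events (
    object_id  uuid,
    event_time timestamp,
    event_type text,
    x          double,
    y          double,
    PRIMARY KEY (object_id, event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

-- Each position change is a fresh insert, never an update,
-- so no read-before-write and no overwrite tombstone pressure.
INSERT INTO tracking.object_events (object_id, event_time, event_type, x, y)
VALUES (uuid(), toTimestamp(now()), 'position_move', 12.5, 7.3);
```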

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2016-11-08 Thread Dikang Gu
Michael, thanks for the info. It sounds to me like a very serious performance regression. :( On Tue, Nov 8, 2016 at 11:39 AM, Michael Kjellman < mkjell...@internalcircle.com> wrote: > Yes, We hit this as well. We have a internal patch that I wrote to mostly > revert the behavior back to ByteBuffers

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Alain Rastoul
On 11/08/2016 11:05 AM, DuyHai Doan wrote: Are you sure Cassandra is a good fit for this kind of heavy update & delete scenario ? +1 this sounds like relational thinking scenario... (no offense, I like relational systems) As if you want to maintain the state of a lot of entities with updates

Re: Are Cassandra writes are faster than reads?

2016-11-08 Thread Ben Bromhead
Awesome! For a full explanation of what you are seeing (we call it micro batching) check out Adam Zegelin's talk on it https://www.youtube.com/watch?v=wF3Ec1rdWgc On Tue, 8 Nov 2016 at 02:21 Rajesh Radhakrishnan < rajesh.radhakrish...@phe.gov.uk> wrote: > > Hi, > > Just found that reducing the

Re: A difficult data model with C*

2016-11-08 Thread Vladimir Yudovin
Hi Ben, if you need only a very limited number of positions (ten, as you said), maybe you can store them in a LIST of UDTs? Or just as a JSON string? So you'll have one row per user-video pair. It can be something like this: CREATE TYPE play (position int, last_time timestamp); CREATE TABLE
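The CREATE TABLE statement in the preview above is truncated. One possible completion of the idea, with table and column names chosen for illustration only (UDTs inside collections must be frozen):

```
CREATE TYPE play (position int, last_time timestamp);

CREATE TABLE user_plays (
    user_id  uuid,
    video_id uuid,
    plays    list<frozen<play>>,  -- the last ten positions, trimmed by the application
    PRIMARY KEY (user_id, video_id)
);
```

With one row per user-video pair, the "ten most recent movies" query from the original thread would still need a separate table or index keyed by user and watch time.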

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Vladimir Yudovin
Yes, as the doc says, "Expired data is marked with a tombstone", but you save the communication with the host and the processing of the DELETE operation. Best regards, Vladimir Yudovin, Winguzone - Hosted Cloud Cassandra Launch your cluster in minutes. On Tue, 08 Nov 2016 09:32:16 -0500Ali Akhtar

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Hannu Kröger
Also if they are being read before compaction: http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html Hannu > On 8 Nov 2016, at 16.36, DuyHai Doan wrote: > > "Does TTL also cause

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread DuyHai Doan
"Does TTL also cause tombstones?" --> Yes, after the TTL expires, at the next compaction the TTLed column is replaced by a tombstone, as per my understanding On Tue, Nov 8, 2016 at 3:32 PM, Ali Akhtar wrote: > Does TTL also cause tombstones? > > On Tue, Nov 8, 2016 at 6:57

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Ali Akhtar
Does TTL also cause tombstones? On Tue, Nov 8, 2016 at 6:57 PM, Vladimir Yudovin wrote: > >The deletes will be done at a scheduled time, probably at the end of the > day, each day. > > Probably you can use TTL? http://docs.datastax.com/en/ >

Re: Designing a table in cassandra

2016-11-08 Thread Vladimir Yudovin
Hi Sathish, probably I didn't catch your requirements exactly, but why not create a single table for all devices, and represent each device as a row, storing both user and network configuration per device. You can use a MAP for a flexible storage model. If you have thousands of devices, creating

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Vladimir Yudovin
The deletes will be done at a scheduled time, probably at the end of the day, each day. Probably you can use TTL? http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html Best regards, Vladimir Yudovin, Winguzone - Hosted Cloud Cassandra Launch your cluster in minutes.
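The TTL approach suggested above lets rows expire automatically instead of relying on a nightly delete job. A minimal sketch, assuming a hypothetical `events` table:

```
-- Expire the row automatically after 24 hours (86400 seconds)
INSERT INTO events (id, payload) VALUES (uuid(), 'data')
USING TTL 86400;

-- Alternatively, set a default expiry for every write to the table
ALTER TABLE events WITH default_time_to_live = 86400;
```

As noted elsewhere in this thread, expired data still becomes a tombstone at compaction time, so TTL saves the DELETE round-trip but not the tombstone itself.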

Re: store individual inventory items in a table, how to assign them correctly

2016-11-08 Thread Vladimir Yudovin
Hi, can you elaborate a little on your data model? Would you like to create 100 rows for each product and then remove one row and add this row to the customer? Best regards, Vladimir Yudovin, Winguzone - Hosted Cloud Cassandra Launch your cluster in minutes. On Mon, 07 Nov 2016

Re: operation and maintenance tools

2016-11-08 Thread Vladimir Yudovin
For memory usage you can use a small command line tool, https://github.com/patric-r/jvmtop Also there are a number of GUI tools that connect to the JMX port, like jvisualvm Best regards, Vladimir Yudovin, Winguzone - Hosted Cloud Cassandra Launch your cluster in minutes. On Mon, 07 Nov

Re: store individual inventory items in a table, how to assign them correctly

2016-11-08 Thread Carlos Alonso
Bear in mind that LWT will, under certain circumstances, fail too. See Chris Batey's amazing talk about it from Cassandra Summit: https://www.youtube.com/watch?v=wcxQM3ZN20c Carlos Alonso | Software Engineer | @calonso On 7 November 2016 at 22:22, Justin Cameron
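The lightweight transactions referred to in this thread use a compare-and-set IF clause. A sketch of claiming a single inventory item, with a hypothetical schema:

```
-- Claim an unassigned item; the write is applied only if
-- owner is still null at Paxos commit time.
UPDATE inventory.items
SET owner = 'customer_42'
WHERE item_id = 123
IF owner = null;
```

The result row carries an `[applied]` flag; as Chris Batey's talk covers, a write timeout can leave the outcome unknown, so the operation should be retried with the same condition rather than assumed to have failed.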

RE: Are Cassandra writes are faster than reads?

2016-11-08 Thread Rajesh Radhakrishnan
Hi, Just found that reducing the batch size below 20 also increases the write speed and reduces memory usage (especially for the Python driver). Kind regards, Rajesh R From: Ben Bromhead [b...@instaclustr.com] Sent: 07 November 2016 05:44 To:

RE: Cassandra Python Driver : execute_async consumes lots of memory?

2016-11-08 Thread Rajesh Radhakrishnan
Hi Lahiru, Great! You know what, reducing the BATCH size from 50 to 20 solved my issue. Thank you very much. Good job, man! The memory issue is solved. Next I will try using Spark to speed it up. Kind regards, Rajesh Radhakrishnan From: Lahiru Gamathige

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Ali Akhtar
Yes, because there will also be a lot of inserts, and the linear scalability that c* offers is required. But the inserts aren't static, and the data that comes in will need to be updated in response to user events. Data which hasn't been touched for over a week has to be deleted. (Sensitive

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread DuyHai Doan
Are you sure Cassandra is a good fit for this kind of heavy update & delete scenario ? Otherwise, you can always use several tables (one table/day, rotating through 7 days for a week) and do a truncate of the table at the end of the day. On Tue, Nov 8, 2016 at 11:04 AM, Ali Akhtar
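The rotation scheme suggested above sidesteps tombstones entirely, because TRUNCATE drops whole SSTables rather than writing per-row deletes. A sketch with hypothetical table names:

```
-- Seven identical tables, one per weekday
CREATE TABLE events_mon (id uuid PRIMARY KEY, payload text);
-- ... events_tue through events_sun defined the same way ...

-- At the end of the day, wipe that day's table in one operation
TRUNCATE events_mon;
```

The application writes to the table matching the current weekday and reads across all seven, so no row-level deletes (and thus no tombstone scans) are ever needed.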

Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Ali Akhtar
I have a use case where a lot of updates and deletes to a table will be necessary. The deletes will be done at a scheduled time, probably at the end of the day, each day. Updates will be done throughout the day, as new data comes in. Are there any guidelines on improving Cassandra's performance

RE: Are Cassandra writes are faster than reads?

2016-11-08 Thread Rajesh Radhakrishnan
Hi, In my case writing is slower using the Python driver, using batch execution and prepared statements. I am looking at different ways to speed it up, as I am trying to write 100 * 200 million records. Cheers Rajesh R From: Vikas Jaiman [er.vikasjai...@gmail.com]

RE: Cassandra Python Driver : execute_async consumes lots of memory?

2016-11-08 Thread Rajesh Radhakrishnan
Hi Lahiru, Thank you for the reply. I will try reducing the batch size to 20 and see how much memory usage I can reduce. I might try Spark streaming too. Cheers! Kind regards, Rajesh R From: Lahiru Gamathige [lah...@highfive.com] Sent: 07 November 2016 17:10