Re: Inconsistencies in materialized views

2016-10-20 Thread siddharth verma
Hi Edward, Thanks a lot for your help. It helped us narrow down the problem. Regards On Mon, Oct 17, 2016 at 9:33 PM, Edward Capriolo wrote: > https://issues.apache.org/jira/browse/CASSANDRA-11198 > > Which has problems "maybe" fixed by: > >

time series data model

2016-10-20 Thread wxn...@zjqunshuo.com
Hi All, I'm trying to migrate my time series data which is GPS trace from mysql to C*. I want a wide row to hold one day data. I designed the data model as below. Please help to see if there is any problem. Any suggestion is appreciated. Table Model: CREATE TABLE cargts.eventdata ( deviceid
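(The CREATE TABLE statement above is cut off in the archive. Below is a minimal sketch of a day-bucketed layout along these lines; everything beyond the deviceid, date, event_time and position columns mentioned later in the thread is an assumption, not the poster's actual schema.)

    CREATE TABLE cargts.eventdata (
        deviceid   int,
        date       int,          -- day bucket, e.g. 20161020
        event_time bigint,       -- milliseconds since the Unix epoch
        position   text,         -- GPS payload (JSON in the original design)
        PRIMARY KEY ((deviceid, date), event_time)
    );

Partitioning on (deviceid, date) limits each wide row to one device-day.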

Re: time series data model

2016-10-20 Thread Vladimir Yudovin
Hi Simon, Why is position text and not float? Text takes much more space. Also, speed and heading can be calculated based on the latest positions, so you could avoid storing them. If you really need them in the database you can save them as floats, or compose a single float value like speed.heading: 41.173 (or

Re: non incremental repairs with cassandra 2.2+

2016-10-20 Thread kurt Greaves
Welp, that's good but wasn't apparent in the codebase :S. Kurt Greaves k...@instaclustr.com www.instaclustr.com On 20 October 2016 at 05:02, Alexander Dejanovski wrote: > Hi Kurt, > > we're not actually. > Reaper performs full repair by subrange but does incremental

Re: time series data model

2016-10-20 Thread kurt Greaves
If event_time is a timestamp since the Unix epoch you 1. may want to use the built-in timestamp type, and 2. order by event_time DESC. 2 applies if you want to do queries such as "select * from eventdata where ... and event_time > x" (i.e. get the latest events). Other than that your model seems
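(Illustrative only: a hedged sketch of the two suggestions above, reusing the column names from this thread; the exact schema is an assumption.)

    CREATE TABLE cargts.eventdata (
        deviceid   int,
        date       int,
        event_time timestamp,    -- built-in timestamp type
        position   text,
        PRIMARY KEY ((deviceid, date), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC);

    -- newest events first, e.g. "events after midday for device 1 on 2016-10-20"
    SELECT * FROM cargts.eventdata
     WHERE deviceid = 1 AND date = 20161020
       AND event_time > '2016-10-20 12:00:00+0000';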

Re: non incremental repairs with cassandra 2.2+

2016-10-20 Thread kurt Greaves
probably because I was looking at the wrong version of the codebase :p

Re: time series data model

2016-10-20 Thread wxn...@zjqunshuo.com
Hi Kurt, I do need to align the time windows to a day bucket to prevent one row from becoming too big, and event_time is a timestamp since the Unix epoch. If I use bigint as the type of event_time, can I do the queries you mentioned? -Simon Wu From: kurt Greaves Date: 2016-10-20 16:18 To: user Subject: Re: time

Re: time series data model

2016-10-20 Thread kurt Greaves
Ah, didn't pick up on that, but it looks like he's storing JSON within position. Is there any strong reason for this, or, as Vladimir mentioned, can you store the fields under "position" in separate columns? Kurt Greaves k...@instaclustr.com www.instaclustr.com On 20 October 2016 at 08:17, Vladimir

Re: time series data model

2016-10-20 Thread wxn...@zjqunshuo.com
Thank you Kurt, I thought the one column identified by the composite key (deviceId+date+event_time) could hold only one value, so I packaged all the info into one JSON. Maybe I'm wrong. I rewrote the table as below. CREATE TABLE cargts.eventdata ( deviceid int, date int,
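(The rewritten statement is truncated above. One possible shape for the per-field layout Vladimir and Kurt suggested — the field names below are assumptions, not Simon's actual columns:)

    CREATE TABLE cargts.eventdata (
        deviceid   int,
        date       int,
        event_time timestamp,
        latitude   double,
        longitude  double,
        speed      float,
        heading    float,
        PRIMARY KEY ((deviceid, date), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC);

A single partition can hold many clustering rows, and each row can carry as many regular columns as needed, so the JSON wrapper isn't required.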

Handle Leap Seconds with Cassandra

2016-10-20 Thread Anuj Wadehra
Hi, I would like to know how you guys handle leap seconds with Cassandra.  I am not bothered about the livelock issue as we are using appropriate versions of Linux and Java. I am more interested in finding an optimum answer for the following question: How do you handle wrong ordering of multiple

Re: Does anyone store larger values in Cassandra E.g. 500 KB?

2016-10-20 Thread Harikrishnan Pillai
We use Cassandra to store images. Any data above 2 MB we chunk and store. It works perfectly. Sent from my iPhone > On Oct 20, 2016, at 12:09 PM, Vikas Jaiman wrote: > > Hi, > > Normally people would like to store smaller values in Cassandra. Is there > anyone
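(One common way to chunk large blobs like this — purely illustrative, not the poster's actual schema:)

    CREATE TABLE media.image_chunks (
        image_id    uuid,
        chunk_index int,
        chunk_data  blob,        -- each chunk kept well below the ~2 MB threshold
        PRIMARY KEY (image_id, chunk_index)
    );

The application splits the image into fixed-size chunks on write and reassembles them in chunk_index order on read.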

Re: Does anyone store larger values in Cassandra E.g. 500 KB?

2016-10-20 Thread Justin Cameron
You can, but it is not really very efficient or cost-effective. You may encounter issues with streaming, repairs and compaction if you have very large blobs (100MB+), so try to keep them under 10MB if possible. I'd suggest storing blobs in something like Amazon S3 and keeping just the bucket name
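(A hedged sketch of the "store a reference, not the blob" approach Justin describes; keyspace, table and column names are made up for illustration:)

    CREATE TABLE media.blob_refs (
        blob_id    uuid PRIMARY KEY,
        s3_bucket  text,
        s3_key     text,
        size_bytes bigint
    );

Cassandra then only holds the small, fixed-size reference, while the object store handles the large payload.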

Does anyone store larger values in Cassandra E.g. 500 KB?

2016-10-20 Thread Vikas Jaiman
Hi, Normally people would like to store smaller values in Cassandra. Is there anyone using it to store larger values (e.g. 500KB or more), and if so, what are the issues you are facing? I would also like to know the tweaks you are considering. Thanks, Vikas

Re: Handle Leap Seconds with Cassandra

2016-10-20 Thread Ben Bromhead
http://www.datastax.com/dev/blog/preparing-for-the-leap-second gives a pretty good overview. If you are using a timestamp as part of your primary key, this is the situation where you could end up overwriting data. I would suggest using timeuuid instead, which will ensure that you get different
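(A small illustrative example of the timeuuid suggestion — table and column names are assumptions:)

    CREATE TABLE sensors.readings (
        sensor_id int,
        event_id  timeuuid,      -- unique even if two writes land on the same millisecond
        value     double,
        PRIMARY KEY (sensor_id, event_id)
    ) WITH CLUSTERING ORDER BY (event_id DESC);

    INSERT INTO sensors.readings (sensor_id, event_id, value) VALUES (1, now(), 42.0);

now() generates a fresh timeuuid per call, so two events recorded in the same millisecond (or across a repeated leap second) still get distinct clustering keys instead of overwriting each other.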

Cluster Maintenance Mishap

2016-10-20 Thread Branton Davis
Howdy folks. I asked some about this in IRC yesterday, but we're looking to hopefully confirm a couple of things for our sanity. Yesterday, I was performing an operation on a 21-node cluster (vnodes, replication factor 3, NetworkTopologyStrategy, and the nodes are balanced across 3 AZs on AWS

Re: Cluster Maintenance Mishap

2016-10-20 Thread Yabin Meng
Most likely the issue is caused by the fact that when you moved the data, you moved the system keyspace data away as well. Meanwhile, because the data was copied into a different location than what C* is expecting, when C* starts it cannot find the system metadata info and therefore

Rebuild failing while adding new datacenter

2016-10-20 Thread Jai Bheemsen Rao Dhanwada
Hello All, I have a single datacenter with 3 C* nodes and we are trying to expand the cluster to another region/DC. I am seeing the below error while doing a "nodetool rebuild -- name_of_existing_data_center". [user@machine ~]$ nodetool rebuild DC1 nodetool: Unable to find sufficient sources for

Re: Rebuild failing while adding new datacenter

2016-10-20 Thread sai krishnam raju potturi
we faced a similar issue earlier, but that was more related to firewall rules. The newly added datacenter was not able to communicate with the existing datacenters on port 7000 (inter-node communication). Yours might be a different issue, but just saying. On Thu, Oct 20, 2016 at 4:12 PM, Jai

Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Yabin Meng
I have seen this on other releases, on 2.2.x. The workaround is exactly like yours; some other system keyspaces also need similar changes. I would say this is a benign bug. Yabin On Thu, Oct 20, 2016 at 4:41 PM, Jai Bheemsen Rao Dhanwada < jaibheem...@gmail.com> wrote: > thanks, > > This
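(The thread doesn't show the exact statements, but the usual workaround for this class of rebuild failure is to give the relevant system keyspaces NetworkTopologyStrategy replicas in the existing DC before rebuilding. A hedged example — the DC name and replication factors are placeholders:)

    ALTER KEYSPACE system_distributed
        WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

    ALTER KEYSPACE system_auth
        WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

    -- system_traces may need the same change on some versions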

Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Jai Bheemsen Rao Dhanwada
thanks, This always works on versions 2.1.13 and 2.1.16 but not on 3.0.8. Definitely not a firewall issue. On Thu, Oct 20, 2016 at 1:16 PM, sai krishnam raju potturi < pskraj...@gmail.com> wrote: > we faced a similar issue earlier, but that was more related to firewall > rules. The newly added

Re: Cluster Maintenance Mishap

2016-10-20 Thread Branton Davis
Thanks for the response, Yabin. However, if there's an answer to my question here, I'm apparently too dense to see it ;) I understand that, since the system keyspace data was not there, it started bootstrapping. What's not clear is whether they took over the token ranges of the previous nodes or got

Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Jai Bheemsen Rao Dhanwada
Thank you Yabin, is there an existing JIRA that I can refer to? On Thu, Oct 20, 2016 at 2:05 PM, Yabin Meng wrote: > I have seen this on other releases, on 2.2.x. The workaround is exactly > like yours, some other system keyspaces also need similar changes. > > I would say

Re: Introducing Cassandra 3.7 LTS

2016-10-20 Thread sankalp kohli
This is awesome. I have sent out the patches which we back-ported into 2.1 on the dev list. On Wed, Oct 19, 2016 at 4:33 PM, kurt Greaves wrote: > > On 19 October 2016 at 21:07, sfesc...@gmail.com > wrote: > >> Wow, thank you for doing this. This

Re: Introducing Cassandra 3.7 LTS

2016-10-20 Thread Ben Bromhead
Thanks Sankalp, we are also reviewing our internal 2.1 list against what you published (though we are trying to upgrade everyone to later versions e.g. 2.2). It's great to compare notes. On Thu, 20 Oct 2016 at 16:19 sankalp kohli wrote: > This is awesome. I have send out

Re: Introducing Cassandra 3.7 LTS

2016-10-20 Thread sankalp kohli
I will also publish 3.0 back ports once we are running 3.0 On Thu, Oct 20, 2016 at 4:23 PM, Ben Bromhead wrote: > Thanks Sankalp, we are also reviewing our internal 2.1 list against what > you published (though we are trying to upgrade everyone to later versions > e.g.

Re: failure node rejoin

2016-10-20 Thread Yuji Ito
Thanks Ben, I tried to run a rebuild and repair after the failed node rejoined the cluster as a "new" node with -Dcassandra.replace_address_first_boot. The failed node could rejoin and I could read all rows successfully. (Sometimes a repair failed because the node could not access another node. If

Re: Cluster Maintenance Mishap

2016-10-20 Thread Yabin Meng
I believe you're using vnodes (because a token range change doesn't make sense for a single-token setup unless you change it explicitly). If you bootstrap a new node with vnodes, I think the way that the token ranges are assigned to the node is random (I'm not 100% sure here, but it should be so

Re: failure node rejoin

2016-10-20 Thread Ben Slater
A couple of questions: 1) At what stage did you have (or expect to have) 1000 rows (and have the mismatch between actual and expected) - at the end of operation (2) or after operation (3)? 2) What replication factor and replication strategy is used by the test keyspace? What consistency level is

Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Yabin Meng
Sorry, I'm not aware of one. On Thu, Oct 20, 2016 at 6:00 PM, Jai Bheemsen Rao Dhanwada < jaibheem...@gmail.com> wrote: > Thank you Yabin, is there a exisiting JIRA that I can refer to? > > On Thu, Oct 20, 2016 at 2:05 PM, Yabin Meng wrote: >> I have seen this on other

Re: Cluster Maintenance Mishap

2016-10-20 Thread Branton Davis
I guess I'm either not understanding how that answers the question and/or I've just done a terrible job of asking it. I'll sleep on it and maybe I'll think of a better way to describe it tomorrow ;) On Thu, Oct 20, 2016 at 8:45 PM, Yabin Meng wrote: > I believe you're

Re: How to throttle up/down compactions without a restart

2016-10-20 Thread Jeff Jirsa
You can also set concurrent compactors through JMX – in the CompactionManager mbean, you have CoreCompactionThreads and MaxCompactionThreads – you can adjust them at runtime, but do it in an order such that Max is always higher than Core From: kurt Greaves

Re: How to throttle up/down compactions without a restart

2016-10-20 Thread kurt Greaves
You can throttle compactions using nodetool setcompactionthroughput x, where x is in Mbps. If you're using 2.2 or later this applies immediately to all running compactions; otherwise it applies only to any "new" compactions. You will want to be careful of allowing compactions to utilise too much disk

Re: failure node rejoin

2016-10-20 Thread Yuji Ito
thanks Ben, > 1) At what stage did you have (or expect to have) 1000 rows (and have the mismatch between actual and expected) - at that end of operation (2) or after operation (3)? after operation 3), at operation 4) which reads all rows by cqlsh with CL.SERIAL > 2) What replication factor and

Re: Cluster Maintenance Mishap

2016-10-20 Thread kurt Greaves
On 20 October 2016 at 20:58, Branton Davis wrote: > Would they have taken on the token ranges of the original nodes or acted > like new nodes and got new token ranges? If the latter, is it possible > that any data moved from the healthy nodes to the "new" nodes or >

How to throttle up/down compactions without a restart

2016-10-20 Thread Thomas Julian
Hello, I was going through this presentation and Slide 55 caught my attention, i.e. "Throttled down compactions during high load period, throttled up during low load period". Can we throttle down compactions without a restart? If this can be done, what are all the

Re: failure node rejoin

2016-10-20 Thread Ben Slater
OK. Are you certain your tests don’t generate any overlapping inserts (by PK)? Cassandra basically treats any inserts with the same primary key as updates (so 1000 insert operations may not necessarily result in 1000 rows in the DB). On Fri, 21 Oct 2016 at 16:30 Yuji Ito
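(A tiny example of the upsert semantics Ben is describing — the keyspace and table here are hypothetical:)

    CREATE TABLE test.items (id int PRIMARY KEY, val text);

    INSERT INTO test.items (id, val) VALUES (1, 'first');
    INSERT INTO test.items (id, val) VALUES (1, 'second');

    -- returns a single row with val = 'second': the second insert overwrote the first
    SELECT * FROM test.items WHERE id = 1;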

Re: Cluster Maintenance Mishap

2016-10-20 Thread Jeremiah D Jordan
The easiest way to figure out what happened is to examine the system log. It will tell you what happened. But I’m pretty sure your nodes got new tokens during that time. If you want to get back the data inserted during the 2 hours you could use sstableloader to send all the data from the

strange node load decrease after nodetool repair -pr

2016-10-20 Thread Oleg Krayushkin
Hi. After I've run a token-ranged repair from the node at 12.5.13.125 with nodetool repair -full -st ${start_tokens[i]} -et ${end_tokens[i]} on every token range, I got this node load:

--  Address      Load      Tokens  Owns   Rack
UN  12.5.13.141  23.94 GB  256     32.3%  rack1
DN  12.5.13.125