Migrating from Datastax Distribution to Apache Cassandra

2017-04-07 Thread Eren Yilmaz
Hi, We have a Cassandra 3.7 installation on Ubuntu, from the Datastax distribution (using the repo). Since Datastax has announced that they will no longer support a community Cassandra distribution, I want to migrate to the Apache distribution. Are there any differences between the distributions? Can I use

cassandra node stops streaming data during nodetool rebuild

2017-04-07 Thread Roland Otta
hi, we are trying to set up a new datacenter and are initializing the data with nodetool rebuild. after some hours it seems that the node stopped streaming (at least there is no more streaming traffic on the network interface). nodetool netstats shows that the streaming is still in progress

Re: cassandra node stops streaming data during nodetool rebuild

2017-04-07 Thread Jacob Shadix
What version are you running? Do you see any errors in the system.log (SocketTimeout, for instance)? And what values do you have for the following in cassandra.yaml: - stream_throughput_outbound_megabits_per_sec - compaction_throughput_mb_per_sec - streaming_socket_timeout_in_ms -- Jacob
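The three settings asked about above can be collected from cassandra.yaml with a short script. This is a minimal sketch using only the standard library; it assumes the settings appear as plain top-level `key: value` lines, and the sample text and its values are made up for illustration, not taken from the thread.

```python
import re

# Settings named in the message above.
SETTINGS = (
    "stream_throughput_outbound_megabits_per_sec",
    "compaction_throughput_mb_per_sec",
    "streaming_socket_timeout_in_ms",
)


def read_settings(yaml_text, names=SETTINGS):
    """Return {name: value-or-None} for top-level 'key: value' lines.

    Commented-out lines are ignored, so None means the setting is not
    set explicitly in the file.
    """
    found = {name: None for name in names}
    for line in yaml_text.splitlines():
        match = re.match(r"^([A-Za-z_]+):\s*(\S+)", line)
        if match and match.group(1) in found:
            found[match.group(1)] = match.group(2)
    return found


# Hypothetical excerpt; the values are illustrative only.
SAMPLE = """\
stream_throughput_outbound_megabits_per_sec: 200
compaction_throughput_mb_per_sec: 16
# streaming_socket_timeout_in_ms: 86400000
"""

for name, value in read_settings(SAMPLE).items():
    print(f"{name}: {value}")
```

To run it against a real node, pass the contents of that node's cassandra.yaml to `read_settings`; the file location depends on how Cassandra was installed.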

Re: cassandra node stops streaming data during nodetool rebuild

2017-04-07 Thread Roland Otta
Hi! we are on 3.7. we have some debug messages ... but i guess they are not related to that issue DEBUG [GossipStage:1] 2017-04-07 13:11:00,440 FailureDetector.java:456 - Ignoring interval time of 2002469610 for /192.168.0.27 DEBUG [GossipStage:1] 2017-04-07 13:11:00,441

Re: cassandra node stops streaming data during nodetool rebuild

2017-04-07 Thread Jacob Shadix
Did you look at the logs on the source DC as well? How big is the dataset? -- Jacob Shadix On Fri, Apr 7, 2017 at 7:16 AM, Roland Otta wrote: > Hi! > > we are on 3.7. > > we have some debug messages ... but i guess they are not related to that > issue > DEBUG

Re: cassandra node stops streaming data during nodetool rebuild

2017-04-07 Thread Roland Otta
good point! on the source side i can see the following error ERROR [STREAM-OUT-/192.168.0.114:34094] 2017-04-06 17:18:56,532 StreamSession.java:529 - [Stream #41606030-1ad9-11e7-9f16-51230e2be4e9] Streaming error occurred on session with peer 10.192.116.1 through 192.168.0.114

Re: Copy from CSV on OS X problem with varint values <= -2^63

2017-04-07 Thread Brice Dutheil
@Boris, which formula did you use on Homebrew, and what is the git version of this formula? Anyway, the current cassandra formula is here: https://github.com/Homebrew/homebrew-core/blob/master/Formula/cassandra.rb I am not a Homebrew developer; the formula does a lot of fancy stuff, yet I see a

AW: How does clustering key works with TimeWindowCompactionStrategy (TWCS)

2017-04-07 Thread j.kesten
Hi Jerry, the compaction strategy just tells Cassandra how to compact your sstables and, with TWCS, when to stop compacting further. But of course your data can and most likely will live in multiple sstables. The magic that happens is that the coordinator node for your request will merge the
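The idea of data for one partition living in several sstables and being merged at read time can be illustrated with a toy model. This is only a sketch of the concept, not Cassandra's actual read path: each "sstable" is assumed to be a list of (clustering_key, write_timestamp, value) tuples sorted by clustering key, and all keys, timestamps, and values are hypothetical.

```python
import heapq
from itertools import groupby


def merge_sstables(*sstables):
    """Merge sorted runs of (clustering_key, write_timestamp, value).

    Each input must already be sorted by clustering_key. For every key,
    the cell with the highest write timestamp wins.
    """
    merged = heapq.merge(*sstables, key=lambda cell: cell[0])
    for key, cells in groupby(merged, key=lambda cell: cell[0]):
        newest = max(cells, key=lambda cell: cell[1])
        yield key, newest[2]


# Hypothetical data: two sstables holding overlapping clustering keys.
older = [("2017-04-05", 100, "view-a"), ("2017-04-06", 110, "view-b")]
newer = [("2017-04-06", 200, "view-c"), ("2017-04-07", 210, "view-d")]

rows = list(merge_sstables(older, newer))
print(rows)
```

The key that appears in both lists resolves to the value with the later timestamp, which is why a single query result can draw on more than one sstable.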

Re: The changing clustering key

2017-04-07 Thread Monmohan Singh
*"your primary goal is to fetch a user by dept_id and user_id and additionally keep versions of the user data?"* My primary goal was to just fetch users for a dept, sorted by modified date. Now the limitation from cassandra that mod_date can't be a clustering key if it can be updated forces me to

too many compactions pending and compaction is slow on few tables

2017-04-07 Thread Giri P
Hi, we are continuously loading a table that uses the LCS compaction strategy with the bloom filter off, and compactions are not catching up. Compaction still runs slowly on that table even after we increased throughput and concurrent compactors. Can someone point me to what

Re: Migrating from Datastax Distribution to Apache Cassandra

2017-04-07 Thread Michael Shuler
Example DDC 3.7.0 to Apache Cassandra 3.10 upgrade with all default configs, no data, and both the DDC and Apache Cassandra lines in sources.list (sorry for any weird wrapping, but I think the list strips attachments): mshuler@hana:~$ apt-cache policy datastax-ddc cassandra datastax-ddc:

Re: too many compactions pending and compaction is slow on few tables

2017-04-07 Thread Giri P
cassandra version : 2.1 volume : initially loading 28 days worth of data around 1 TB and then we process hourly load: only cassandra running on nodes disks: spinning disks On Fri, Apr 7, 2017 at 11:27 AM, Jonathan Haddad wrote: > What version of Cassandra? How much data?

Re: How does clustering key works with TimeWindowCompactionStrategy (TWCS)

2017-04-07 Thread Jon Haddad
Alex Dejanovski wrote a good post on how the LIMIT clause works and why it doesn’t (until 3.4) work the way you think it would. http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html > On Apr 7, 2017, at 7:23 AM, Jerry Lam

Re: How does clustering key works with TimeWindowCompactionStrategy (TWCS)

2017-04-07 Thread Jonathan Haddad
Hey Jerry - very happy to hear the post answered your questions. Alex wrote another great post on TWCS you might find useful, since you're using it: http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html On Fri, Apr 7, 2017 at 8:20 AM Jerry Lam wrote: > Hi Jon, > >

Re: Migrating from Datastax Distribution to Apache Cassandra

2017-04-07 Thread daemeon reiydelle
Having done variants of this, I would suggest you bring up new nodes at approximately the same Apache version as a separate data center, in your same cluster. Replication strategy may need to be tweaked *...* *Daemeon C.M. Reiydelle USA (+1) 415.501.0198 London (+44) (0) 20 8144 9872* On

Re: too many compactions pending and compaction is slow on few tables

2017-04-07 Thread Giri P
Does LCS try compacting already compacted files if it sees the same key loaded again? On Fri, Apr 7, 2017 at 11:39 AM, Giri P wrote: > cassandra version : 2.1 > volume : initially loading 28 days worth of data around 1 TB and then we > process hourly > load: only cassandra

Re: too many compactions pending and compaction is slow on few tables

2017-04-07 Thread Matija Gobec
It does, because the "new" data, even if the values are the same, has a new write timestamp. Spinning disks are hard to run LCS on. Do you maybe have some kind of non-striped RAID in place? On Fri, Apr 7, 2017 at 8:46 PM, Giri P wrote: > Does LCS try compacting already compacted

Re: Migrating from Datastax Distribution to Apache Cassandra

2017-04-07 Thread Michael Shuler
This is prudent advice, but a rolling upgrade from DDC 3.7.0 to Apache Cassandra 3.10, after updating your sources.list, should also work fine. Just back up all your configurations, and if your data is mission critical, follow a good backup strategy for that, too. Testing the upgrade in your

Re: too many compactions pending and compaction is slow on few tables

2017-04-07 Thread Jonathan Haddad
What version of Cassandra? How much data? How often are you reloading it? Is compaction throttled? What disks are you using? Any other load on the machine? On Fri, Apr 7, 2017 at 11:19 AM Giri P wrote: > Hi, > > we are continuously loading a table which has properties

Re: How does clustering key works with TimeWindowCompactionStrategy (TWCS)

2017-04-07 Thread Jerry Lam
Hi Jan, Thank you for the clarification and knowledge sharing. A follow-up question is: Does Cassandra need to read all sstables for customer_id = 1L if my query is: select view_id from customer_view where customer_id = 1L limit 1 Since I have the date_day as the clustering key and it is

Re: Node always dieing

2017-04-07 Thread Cogumelos Maravilha
There's a tweak. I forgot to put this in the new instance: At /lib/udev/rules.d/ cat 40-vm-hotadd.rules # On Hyper-V and Xen Virtual Machines we want to add memory and cpus as soon as they appear ATTR{[dmi/id]sys_vendor}=="Microsoft Corporation", ATTR{[dmi/id]product_name}=="Virtual

Re: cassandra node stops streaming data during nodetool rebuild

2017-04-07 Thread Jacob Shadix
I don't see an issue with the size of the data / node. You can attempt the rebuild again and play around with throughput if your network can handle it. It can be changed on-the-fly with nodetool: nodetool setstreamthroughput This article is also worth a read -
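Changing the streaming throughput on the fly, as suggested above, can be scripted. The sketch below only builds and optionally runs the `nodetool setstreamthroughput` command; the value is assumed to be in megabits per second (matching stream_throughput_outbound_megabits_per_sec), and the number used is an arbitrary illustration, not a figure from the thread.

```python
import subprocess


def stream_throughput_command(megabits_per_sec):
    """Build the argument list for 'nodetool setstreamthroughput'."""
    if megabits_per_sec < 0:
        raise ValueError("throughput must be zero or positive")
    return ["nodetool", "setstreamthroughput", str(int(megabits_per_sec))]


def set_stream_throughput(megabits_per_sec):
    """Run the command on the local node; requires nodetool on PATH."""
    subprocess.run(stream_throughput_command(megabits_per_sec), check=True)


# 400 is an arbitrary example value, not a recommendation from the thread.
print(" ".join(stream_throughput_command(400)))
```

Keeping the command construction separate from the call to `subprocess.run` makes it possible to inspect what would be executed before touching a live node.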

Re: How does clustering key works with TimeWindowCompactionStrategy (TWCS)

2017-04-07 Thread Jerry Lam
Hi Jon, This Cassandra community is very helpful!!! Thanks for sharing this blogpost with me. It answers all my questions related to TWCS with clustering key and limit clause! Best Regards, Jerry On Fri, Apr 7, 2017 at 10:30 AM, Jon Haddad wrote: > Alex

Re: too many compactions pending and compaction is slow on few tables

2017-04-07 Thread Carlos Rolo
It is not a good idea to run LCS on spinning disks. Change to STCS, and reduce the compactors to 2 (if you have more than 2). Check if that helps. On Apr 7, 2017 20:18, "Matija Gobec" wrote: > It does as the "new" data, even if the values are the same, has new write > time
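The switch to STCS suggested above is a per-table schema change. As a rough sketch, the helper below builds the CQL statement for it; the keyspace and table names are placeholders, since the thread does not name them, and the statement would still need to be run through cqlsh or a driver.

```python
def switch_to_stcs(keyspace, table):
    """Return a CQL statement that sets SizeTieredCompactionStrategy."""
    for name in (keyspace, table):
        # Keep the sketch safe: accept only simple unquoted identifiers.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected identifier: {name!r}")
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH compaction = {'class': 'SizeTieredCompactionStrategy'};"
    )


# 'my_ks' and 'my_table' are placeholder names, not from the thread.
print(switch_to_stcs("my_ks", "my_table"))
```

The number of compactors is a node-level setting (concurrent_compactors in cassandra.yaml), so it is not part of this statement.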