Re: Contributing cassandra-diff

2019-08-22 Thread Roopa Tangirala
Markus,

This is great and very helpful for anyone running Cassandra in production
who wants peace of mind when rolling out upgrades. Thank you!

*Regards,*

*Roopa Tangirala*

*Director, Cloud Data Engineering*



On Thu, Aug 22, 2019 at 9:14 AM Michael Shuler 
wrote:

> CI git polling for changes on a separate repository (if/when CI is
> needed) is probably a better way to go. I don't believe there are any
> issues with INFRA on us having discrete repos, and creating them with
> the self-help web tool is quick and easy.
>
> Thanks for the neat looking utility!
>
> Michael
>
> On 8/22/19 10:33 AM, Sankalp Kohli wrote:
> > A different repo will be better
> >
> >> On Aug 22, 2019, at 6:16 AM, Per Otterström <
> per.otterst...@ericsson.com> wrote:
> >>
> >> Very powerful tool indeed, thanks for sharing!
> >>
> >> I believe it is best to keep tools like this in different repos since
> different tools will probably have different life cycles and tool chains.
> Yes, that could be handled in a single repo, but with different repos we'd
> get natural boundaries.
> >>
> >> -Original Message-
> >> From: Sumanth Pasupuleti 
> >> Sent: den 22 augusti 2019 14:40
> >> To: dev@cassandra.apache.org
> >> Subject: Re: Contributing cassandra-diff
> >>
> >> No hard preference on the repo, but just excited about this tool!
> Looking forward to employing this for upgrade testing (very timely :))
> >>
> >>> On Thu, Aug 22, 2019 at 3:38 AM Sam Tunnicliffe 
> wrote:
> >>>
> >>> My own weak preference would be for a dedicated repo in the first
> >>> instance. If/when additional tools are contributed we should look at
> >>> co-locating common stuff, but rushing toward a monorepo would be a
> >>> mistake IMO.
> >>>
> >>>>> On 22 Aug 2019, at 11:10, Jeff Jirsa  wrote:
> >>>>
> >>>> I weakly prefer contrib.
> >>>>
> >>>>
> >>>> On Thu, Aug 22, 2019 at 12:09 PM Marcus Eriksson
> >>>> 
> >>> wrote:
> >>>>
> >>>>> Hi, we are about to open-source our tooling for comparing two
> >>>>> Cassandra clusters and want to get some feedback on where to push it.
> >>>>> I think the options are (name bike-shedding welcome):
> >>>>>
> >>>>> 1. create repos/asf/cassandra-diff.git
> >>>>> 2. create a generic repos/asf/cassandra-contrib.git where we can add
> >>>>> more contributed tools in the future
> >>>>>
> >>>>> Temporary location:
> >>>>> https://github.com/krummas/cassandra-diff
> >>>>>
> >>>>> Cassandra-diff is a Spark job that compares the data in two clusters:
> >>>>> it pages through all partitions and reads all rows for those partitions
> >>>>> in both clusters to make sure they are identical. Based on the
> >>>>> configuration variable "reverse_read_probability", the rows are read
> >>>>> either forward or in reverse order.
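
For anyone curious what that comparison boils down to, here is a minimal,
illustrative sketch (not the actual cassandra-diff code): it assumes the
DataStax Java driver 3.x, a hypothetical table ks.tbl with partition key pk
and clustering column ck, and a made-up reverse_read_probability constant.
The real tool splits the token ranges across Spark executors and walks every
partition; this only compares a single partition.

    // Minimal sketch of the cassandra-diff idea, NOT the actual tool.
    // Assumes: DataStax Java driver 3.x, a hypothetical table ks.tbl
    // with partition key pk and clustering column ck.
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    import java.util.Iterator;
    import java.util.Random;

    public class NaiveClusterDiff {
        // Mirrors the reverse_read_probability configuration variable.
        private static final double REVERSE_READ_PROBABILITY = 0.25;
        private static final Random RANDOM = new Random();

        public static void main(String[] args) throws Exception {
            try (Cluster source = Cluster.builder().addContactPoint("source-host").build();
                 Cluster target = Cluster.builder().addContactPoint("target-host").build();
                 Session src = source.connect();
                 Session dst = target.connect()) {

                Object partitionKey = 42L; // the real tool pages through every partition

                // Randomly read forward or in reverse so both iteration paths get exercised.
                String order = RANDOM.nextDouble() < REVERSE_READ_PROBABILITY ? "DESC" : "ASC";
                String cql = "SELECT * FROM ks.tbl WHERE pk = ? ORDER BY ck " + order;

                Iterator<Row> a = src.execute(cql, partitionKey).iterator();
                Iterator<Row> b = dst.execute(cql, partitionKey).iterator();

                long n = 0;
                while (a.hasNext() && b.hasNext()) {
                    // Row has no deep equals(); comparing formatted rows is enough for a sketch.
                    if (!a.next().toString().equals(b.next().toString())) {
                        System.out.println("Mismatch in partition " + partitionKey + " at row " + n);
                    }
                    n++;
                }
                if (a.hasNext() || b.hasNext()) {
                    System.out.println("Row count mismatch in partition " + partitionKey);
                }
            }
        }
    }
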
> >>>>>
> >>>>> Our main use case for cassandra-diff has been to set up two identical
> >>>>> clusters, transfer a snapshot from the cluster we want to test to these
> >>>>> clusters, and upgrade one side. When that is done, we run this tool to
> >>>>> make sure that 2.1 and 3.0 give the same results. A few examples of the
> >>>>> bugs we have found using this tool:
> >>>>>
> >>>>> * CASSANDRA-14823: Legacy sstables with range tombstones spanning
> >>>>> multiple index blocks create invalid bound sequences on 3.0+
> >>>>> * CASSANDRA-14803: Rows that cross index block boundaries can cause
> >>>>> incomplete reverse reads in some cases
> >>>>> * CASSANDRA-15178: Skipping illegal legacy cells can break reverse
> >>>>> iteration of indexed partitions
> >>>>>
> >>>>> /Marcus
> >>>>>
> >>>>> ---
> >>>>> -- To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>>>
> >>>>>
> >>>
> >>>
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>
> >>>
> >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [VOTE] Accept GoCQL driver donation and begin incubation process

2018-09-12 Thread Roopa Tangirala
+1


*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*






On Wed, Sep 12, 2018 at 8:51 AM Sylvain Lebresne  wrote:

> -0
>
> The project seems to have a hard time getting on top of reviewing its
> backlog of 'patch available' issues, so I'm skeptical that adopting more
> code to maintain is the thing the project needs the most right now.
> Besides, I'm also generally skeptical that augmenting the scope of a
> project makes it better: I feel keeping this project focused on the core
> server is better. I see risks here, but the upsides haven't been made very
> clear to me, even for end users: yes, it may provide a tiny bit more
> clarity around which Golang driver to choose by default, but I'm not sure
> users are that lost, and I think there are other ways to solve that if we
> really want to.
>
> Anyway, I reckon I may be overly pessimistic here, and it's not that strong
> of an objection if a large majority is on board, so I'm giving my opinion
> but not opposing.
>
> --
> Sylvain
>
>
> On Wed, Sep 12, 2018 at 5:36 PM Jeremiah D Jordan <
> jeremiah.jor...@gmail.com>
> wrote:
>
> > +1
> >
> > But I also think getting this through incubation might take a while/be
> > impossible given how large the contributor list looks…
> >
> > > On Sep 12, 2018, at 10:22 AM, Jeff Jirsa  wrote:
> > >
> > > +1
> > >
> > > (Incubation looks like it may be challenging to get acceptance from all
> > existing contributors, though)
> > >
> > > --
> > > Jeff Jirsa
> > >
> > >
> > >> On Sep 12, 2018, at 8:12 AM, Nate McCall  wrote:
> > >>
> > >> This will be the same process used for dtest. We will need to walk
> > >> this through the incubator per the process outlined here:
> > >>
> > >>
> >
> https://incubator.apache.org/guides/ip_clearance.html
> > >>
> > >> Pending the outcome of this vote, we will create the JIRA issues for
> > >> tracking and after we go through the process, and discuss adding
> > >> committers in a separate thread (we need to do this atomically anyway
> > >> per general ASF committer adding processes).
> > >>
> > >> Thanks,
> > >> -Nate
> > >>
> > >> -
> > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>


Re: [VOTE] Development Approach for Apache Cassandra Management process

2018-09-12 Thread Roopa Tangirala
+1 to b. Take the best from existing side cars and make a great side car
that ships with Cassandra.


*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*






On Wed, Sep 12, 2018 at 8:19 AM sankalp kohli 
wrote:

> Hi,
> The community has been discussing an Apache Cassandra Management process
> since April, and we have had a lot of discussion about which approach to
> take to get started. Several contributors have been interested in doing
> this, and we need to make a decision on which approach to take.
>
> The current approaches being evaluated are
> a. Donate an existing project to Apache Cassandra like Reaper. If this
> option is selected, we will evaluate various projects and see which one
> fits best.
> b. Take a piecemeal approach and use the features from different OSS
> projects and build a new project.
>
> Available options to vote
> a. +1 to use existing project.
> b. +1 to take piecemeal approach
> c. -1 to both
> d. +0 I don't mind either option
>
> You can also just type a, b, c, or d to choose an option.
>
> Dev threads with discussions
>
>
> https://lists.apache.org/thread.html/4eace8cb258aab83fc3a220ff2203a281ea59f4d6557ebeb1af7b7f1@%3Cdev.cassandra.apache.org%3E
>
>
> https://lists.apache.org/thread.html/4a7e608c46aa2256e8bcb696104a4e6d6aaa1f302834d211018ec96e@%3Cdev.cassandra.apache.org%3E
>


Re: Reaper as cassandra-admin

2018-08-28 Thread Roopa Tangirala
I share Dinesh's concern regarding tech debt with an existing codebase.
It's good we have multiple solutions for repairs, which have always been
painful in Cassandra. It would be great to see the community take the best
pieces from the available solutions and roll them into the fresh side car,
which will help ease Cassandra's maintenance for a lot of folks.

My main concern with starting with an existing codebase is that it comes
with tech debt. This is not specific to Reaper but to any codebase that is
imported as a whole. This means future developers and patches have to work
within the confines of the decisions that were already made. Practically
speaking once a codebase is established there is inertia in making
architectural changes and we're left dealing with technical debt.



*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*






On Mon, Aug 27, 2018 at 10:49 PM Dinesh Joshi
 wrote:

> > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad  wrote:
> > We're hoping to get some feedback on our side if that's something people
> > are interested in.  We've gone back and forth privately on our own
> > preferences, hopes, dreams, etc, but I feel like a public discussion
> would
> > be healthy at this point.  Does anyone share the view of using Reaper as
> a
> > starting point?  What concerns to people have?
>
>
> I have briefly looked at the Reaper codebase but I am yet to analyze it
> better to have a real, meaningful opinion.
>
> My main concern with starting with an existing codebase is that it comes
> with tech debt. This is not specific to Reaper but to any codebase that is
> imported as a whole. This means future developers and patches have to work
> within the confines of the decisions that were already made. Practically
> speaking once a codebase is established there is inertia in making
> architectural changes and we're left dealing with technical debt.
>
> As it stands I am not against the idea of using Reaper's features and I
> would very much like using mature code that has been tested. I would
> however like to propose piece-mealing it into the codebase. This will give
> the community a chance to review what is going in and possibly change some
> of the design decisions upfront. This will also avoid a situation where we
> have to make many breaking changes in the initial versions due to
> refactoring.
>
> I would also like it if we could compare and contrast the functionality
> with Priam or any other interesting sidecars that folks may want to call
> out. In fact it would be great if we could bring in the best functionality
> from multiple implementations.
>
> Dinesh
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Yet another repair solution

2018-08-28 Thread Roopa
+1 interested in seeing and understanding another repair solution. 

> On Aug 28, 2018, at 1:03 PM, Joseph Lynch  wrote:
> 
> I'm pretty interested in seeing and understanding your solution! When we
> started on CASSANDRA-14346, reading your design documents and the plan you
> sketched out in CASSANDRA-10070 was really helpful in improving our
> design. I'm particularly interested in how the Scheduler/Job/Task APIs
> turned out (we're working on something similar internally and would love to
> compare notes and figure out the best way to implement that kind of
> abstraction).
> 
> -Joey
> 
> 
> On Tue, Aug 28, 2018 at 6:34 AM Marcus Olsson 
> wrote:
> 
>> Hi,
>> 
>> At the risk of stirring the repair/side-car topic even further, I'd just
>> like to mention that we have recently gotten approval to contribute our
>> repair management side-car solution.
>> It's based on the proposal in
>> https://issues.apache.org/jira/browse/CASSANDRA-10070 as a standalone
>> application sitting next to each instance.
>> With the recent discussions in mind I'd just like to hear the thoughts
>> from the community on this before we put in the effort of bringing our
>> solution into open source.
>> 
>> Would there be an interest of having yet another repair solution in the
>> discussion?
>> 
>> Best Regards
>> Marcus Olsson
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Proposing an Apache Cassandra Management process

2018-08-20 Thread Roopa Tangirala
+1 to everything that Joey articulated, with emphasis on the fact that
contributions should be evaluated based on the merit of the code and its
value add to the whole offering. I hope it does not matter whether that
contribution comes from a PMC member or a person who is not a committer. I
would like the process to be such that it encourages new members to be
part of the community and not shy away from contributing code on the
assumption that their contributions are valued differently than those of
committers or PMC members. It would be sad to see contributions decrease
if we go down that path.

*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*






On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch  wrote:

> > We are looking to contribute Reaper to the Cassandra project.
> >
> Just to clarify: are you proposing contributing Reaper as a project via
> donation, or are you planning on contributing the features of Reaper as
> patches to Cassandra? If the former, how far along are you in the donation
> process? If the latter, when do you think you would have patches ready for
> consideration/review?
>
>
> > Looking at the patch it's very similar in its base design already, but
> > Reaper does have a lot more to offer. We have all been working hard to
> > move it to also being a side-car so it can be contributed. This raises a
> > number of questions relevant to this thread: would we then accept both
> > works in the Cassandra project, and what burden would it put on the
> > current PMC to maintain both works?
> >
> I would hope that we would collaborate on merging the best parts of all of
> them into the official Cassandra sidecar, taking the always-on,
> shared-nothing, highly available system that we've contributed a patchset
> for and adding in many of the repair features (e.g. schedules, a nice web
> UI) that Reaper has.
>
>
> > I share Stefan's concern that consensus had not been met around a
> > side-car, and that it was somehow default accepted before a patch landed.
>
>
> I feel this is not correct or fair. The sidecar and repair discussions have
> been anything _but_ "default accepted". The timeline of consensus building
> involving the management sidecar and repair scheduling plans:
>
> Dec 2016: Vinay worked with Jon and Alex to try to collaborate on Reaper to
> come up with design goals for a repair scheduler that could work at Netflix
> scale.
>
> ~Feb 2017: Netflix believes that the fundamental design gaps prevented us
> from using Reaper as it relies heavily on remote JMX connections and
> central coordination.
>
> Sep. 2017: Vinay gives a lightning talk at NGCC about a highly available
> and distributed repair scheduling sidecar/tool. He is encouraged by
> multiple committers to build repair scheduling into the daemon itself and
> not as a sidecar so the database is truly eventually consistent.
>
> ~Jun. 2017 - Feb. 2018: Based on internal need and the positive feedback at
> NGCC, Vinay and myself prototype the distributed repair scheduler within
> Priam and roll it out at Netflix scale.
>
> Mar. 2018: I open a Jira (CASSANDRA-14346) along with a detailed 20 page
> design document for adding repair scheduling to the daemon itself and open
> the design up for feedback from the community. We get feedback from Alex,
> Blake, Nate, Stefan, and Mick. As far as I know there were zero proposals
> to contribute Reaper at this point. We hear the consensus that the
> community would prefer repair scheduling in a separate distributed sidecar
> rather than in the daemon itself and we re-work the design to match this
> consensus, re-aligning with our original proposal at NGCC.
>
> Apr 2018: Blake brings the discussion of repair scheduling to the dev list
> (
>
> https://lists.apache.org/thread.html/760fbef677f27aa5c2ab4c375c7efeb81304fea428deff986ba1c2eb@%3Cdev.cassandra.apache.org%3E
> ).
> Many community members give positive feedback that we should solve it as
> part of Cassandra and there is still no mention of contributing Reaper at
> this point. The last message is my attempted summary giving context on how
> we want to take the best of all the sidecars (OpsCenter, Priam, Reaper) and
> ship them with Cassandra.
>
> Apr. 2018: Dinesh opens CASSANDRA-14395 along with a public design document
> for gathering feedback on a general management sidecar. Sankalp and Dinesh
> encourage Vinay and myself to kickstart that sidecar using the repair
> scheduler patch
>
> Apr 2018: Dinesh reaches out to the dev list (
>
> https://lists.apache.org/thread.html/a098341efd8f344494bcd2761dba5125e971b59b1dd54f282ffda253@%3Cdev.cassandra.apache.org%3E
> )
> about the general management process to gain further feedback. All feedback

Re: Proposing an Apache Cassandra Management process

2018-08-17 Thread Roopa
+1 to maintaining a separate project and release cycle for the side car. We
have been running a side car in production for 6+ years, and the rate of
change to the side car is much higher than to the actual data store. This
will enable the faster iteration needed for the side car and help folks roll
out maintenance fixes easily.

Thanks
Roopa



> On Aug 17, 2018, at 8:52 AM, Blake Eggleston  wrote:
> 
> I'd be more in favor of making it a separate project, basically for all the 
> reasons listed below. I'm assuming we'd want a management process to work 
> across different versions, which will be more awkward if it's in tree. Even 
> if that's not the case, keeping it in a different repo at this point will 
> make iteration easier than if it were in tree. I'd imagine (or at least hope) 
> that validating the management process for release would be less difficult 
> than the main project, so tying them to the Cassandra release cycle seems 
> unnecessarily restrictive.
> 
> 
>> On August 17, 2018 at 12:07:18 AM, Dinesh Joshi 
>> (dinesh.jo...@yahoo.com.invalid) wrote:
>> 
>> On Aug 16, 2018, at 9:27 PM, Sankalp Kohli  wrote: 
>> 
>> I am bumping this thread because patch has landed for this with repair 
>> functionality. 
>> 
>> I have a following proposal for this which I can put in the JIRA or doc 
>> 
>> 1. We should see if we can keep this in a separate repo like Dtest. 
> 
> This would imply a looser coupling between the two. Keeping things in-tree is 
> my preferred approach. It makes testing, dependency management and code 
> sharing easier. 
> 
>> 2. It should have its own release process. 
> 
> This means now there would be two releases that need to be managed and 
> coordinated. 
> 
>> 3. It should have integration tests for different versions of Cassandra it 
>> will support. 
> 
> Given the lack of test infrastructure - this will be hard especially if you 
> have to qualify a matrix of builds. 
> 
> Dinesh 
> - 
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org 
> For additional commands, e-mail: dev-h...@cassandra.apache.org 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Branching Change for 4.0 Freeze

2018-07-13 Thread Roopa
+ 1

Thanks,
Roopa Tangirala



> On Jul 13, 2018, at 1:38 PM, Nate McCall  wrote:
> 
> +1
> I appreciate Gary's points, but if it's not working and/or we have a
> specific issue, we'll address it.
> 
> 
> 
> 
>> On Thu, Jul 12, 2018 at 9:46 AM, sankalp kohli  
>> wrote:
>> Hi,
>>As discussed in the thread[1], we are proposing that we will not branch
>> on 1st September but will only allow following merges into trunk.
>> 
>> a. Bug and Perf fixes to 4.0.
>> b. Critical bugs in any version of C*.
>> c. Testing changes to help test 4.0
>> 
>> If someone has a change which does not fall under these three, we can
>> always discuss it and have an exception.
>> 
>> Vote will be open for 72 hours.
>> 
>> Thanks,
>> Sankalp
>> 
>> [1]
>> https://lists.apache.org/thread.html/494c3ced9e83ceeb53fa127e44eec6e2588a01b769896b25867fd59f@%3Cdev.cassandra.apache.org%3E
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Repair scheduling tools

2018-04-03 Thread Roopa Tangirala
Having seen so many companies grapple with running repairs successfully in
production, and seeing the success of distributed scheduled repair here at
Netflix, I strongly believe that adding this to Cassandra would be a great
addition to the database. I am hoping we as a community will make it easy
for teams to operate and run Cassandra by enhancing the core product, and
making maintenance tasks like repairs and compactions part of the database
without external tooling. We can have an experimental flag for the feature
so that only teams who are confident with the service enable it, while
others fall back to default repairs.


*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*





On Tue, Apr 3, 2018 at 4:19 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Why not make it configurable?
> auto_manage_repair_consistancy: true (default: false)
>
> Then users can use the built-in auto repair function that would be created,
> or continue to handle it as they do now. Default behavior would be "false" so
> nothing changes on its own. Just wondering why not have that option? It
> might accelerate progress, as others have already suggested.
>
> Kenneth Brotman
>
> -Original Message-
> From: Nate McCall [mailto:zznat...@gmail.com]
> Sent: Tuesday, April 03, 2018 1:37 PM
> To: dev
> Subject: Re: Repair scheduling tools
>
> This document does a really good job of listing out some of the issues of
> coordinating repair scheduling. Regardless of which camp you fall into, it
> is certainly worth a read.
>
> On Wed, Apr 4, 2018 at 8:10 AM, Joseph Lynch <joe.e.ly...@gmail.com>
> wrote:
> > I just want to say I think it would be great for our users if we moved
> > repair scheduling into Cassandra itself. The team here at Netflix has
> > opened the ticket
> > <https://issues.apache.org/jira/browse/CASSANDRA-14346>
> > and have written a detailed design document
> > <https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9G
> > bFSEyGzEtM/edit#heading=h.iasguic42ger>
> > that includes problem discussion and prior art if anyone wants to
> > contribute to that. We tried to fairly discuss existing solutions,
> > what their drawbacks are, and a proposed solution.
> >
> > If we were to put this as part of the main Cassandra daemon, I think
> > it should probably be marked experimental and of course be something
> > that users opt into (table by table or cluster by cluster) with the
> > understanding that it might not fully work out of the box the first
> > time we ship it. We have to be willing to take risks but we also have
> > to be honest with our users. It may help build confidence if a few
> > major deployments use it (such as Netflix) and we are happy of course
> > to provide that QA as best we can.
> >
> > -Joey
> >
> > On Tue, Apr 3, 2018 at 10:48 AM, Blake Eggleston
> > <beggles...@apple.com>
> > wrote:
> >
> >> Hi dev@,
> >>
> >>
> >>
> >> The question of the best way to schedule repairs came up on
> >> CASSANDRA-14346, and I thought it would be good to bring up the idea
> >> of an external tool on the dev list.
> >>
> >>
> >>
> >> Cassandra lacks any sort of tools for automating routine tasks that
> >> are required for running clusters, specifically repair. Regular
> >> repair is a must for most clusters, like compaction. This means that,
> >> especially as far as eventual consistency is concerned, Cassandra
> >> isn’t totally functional out of the box. Operators either need to
> >> find a 3rd party solution or implement one themselves. Adding this to
> >> Cassandra would make it easier to use.
> >>
> >>
> >>
> >> Is this something we should be doing? If so, what should it look like?
> >>
> >>
> >>
> >> Personally, I feel like this is a pretty big gap in the project and
> >> would like to see an out of process tool offered. Ideally, Cassandra
> >> would just take care of itself, but writing a distributed repair
> >> scheduler that you trust to run in production is a lot harder than
> >> writing a single process management application that can failover.
> >>
> >>
> >>
> >> Any thoughts on this?
> >>
> >>
> >>
> >> Thanks,
> >>
> >>
> >>
> >> Blake
> >>
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: New committers announcement

2017-02-15 Thread Roopa Tangirala
Congratulations to all the new committers!


*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*





On Wed, Feb 15, 2017 at 7:48 AM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:

> Three cheers!
> Hip , Hip, NotFound
> 1 ms later
> Hip, Hip, Hooray
> 1 ms later
> Hooray, Hooray, Hooray
>
> On Tue, Feb 14, 2017 at 5:50 PM, Ben Bromhead <b...@instaclustr.com> wrote:
>
> > Congrats!!
> >
> > On Tue, 14 Feb 2017 at 13:37 Joaquin Casares <joaq...@thelastpickle.com>
> > wrote:
> >
> > > Congratulations!
> > >
> > > +1 John's sentiments. That's a great list of new committers! :)
> > >
> > > Joaquin Casares
> > > Consultant
> > > Austin, TX
> > >
> > > Apache Cassandra Consulting
> > > http://www.thelastpickle.com
> > >
> > > On Tue, Feb 14, 2017 at 3:34 PM, Jonathan Haddad <j...@jonhaddad.com>
> > > wrote:
> > >
> > > > Congratulations! Definitely a lot of great contributions from
> everyone
> > on
> > > > the list.
> > > > On Tue, Feb 14, 2017 at 1:31 PM Jason Brown <jasedbr...@gmail.com>
> > > wrote:
> > > >
> > > > > Hello all,
> > > > >
> > > > > It's raining new committers here in Apache Cassandra!  I'd like to
> > > > announce
> > > > > the following individuals are now committers for the project:
> > > > >
> > > > > Branimir Lambov
> > > > > Paulo Motta
> > > > > Stefan Pokowinski
> > > > > Ariel Weisberg
> > > > > Blake Eggleston
> > > > > Alex Petrov
> > > > > Joel Knighton
> > > > >
> > > > > Congratulations all! Please keep the excellent contributions
> coming.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > -Jason Brown
> > > > >
> > > >
> > >
> > --
> > Ben Bromhead
> > CTO | Instaclustr <https://www.instaclustr.com/>
> > +1 650 284 9692
> > Managed Cassandra / Spark on AWS, Azure and Softlayer
> >
>


Re: Dropped messages on random nodes.

2017-01-23 Thread Roopa Tangirala
Dikang,

Did you take a look at the heap health on those nodes? A quick heap
histogram or dump would help you figure out if it is related to a data
issue (wide rows or a bad model) where a few nodes may be coming under heap
pressure and dropping messages.
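
To make that check concrete, here is a minimal sketch (not an official
Cassandra tool, and all names here are made up) of polling heap usage and
cumulative GC counts/times through the standard JVM MXBeans. Diffing
successive samples against the dropped-message timestamps would show whether
a long collection lines up with the pauses; in practice GC logs, a heap
histogram (e.g. via jmap), or an external JMX tool would usually be used
instead.

    // Minimal sketch: periodically log heap usage and cumulative GC stats so
    // they can be correlated with dropped-message log lines. Illustrative only.
    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;

    public class GcPressureLogger {
        public static void main(String[] args) throws Exception {
            while (true) {
                MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
                System.out.printf("heap used=%dMB max=%dMB%n",
                        heap.getUsed() >> 20, heap.getMax() >> 20);
                for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                    // Counts and times are cumulative; diff successive samples to spot long pauses.
                    System.out.printf("  %s: collections=%d totalTimeMs=%d%n",
                            gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
                }
                Thread.sleep(5000L);
            }
        }
    }
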

Thanks,
Roopa



*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*





On Mon, Jan 23, 2017 at 4:55 PM, Blake Eggleston <beggles...@apple.com>
wrote:

> Hi Dikang,
>
> Do you have any GC logging or metrics you can correlate with the dropped
> messages? A 13 second pause sounds like a bad GC pause.
>
> Thanks,
>
> Blake
>
>
> On January 22, 2017 at 10:37:22 PM, Dikang Gu (dikan...@gmail.com) wrote:
>
> Btw, the C* version is 2.2.5, with several backported patches.
>
> On Sun, Jan 22, 2017 at 10:36 PM, Dikang Gu <dikan...@gmail.com> wrote:
>
> > Hello there,
> >
> > We have a cluster of roughly 100 nodes, and I find that there are dropped
> > messages on random nodes in the cluster, which cause error spikes and P99
> > latency spikes as well.
> >
> > I tried to figure out the cause. I do not see any obvious bottleneck in
> > the cluster; the C* nodes still have plenty of idle CPU and disk I/O
> > headroom. But I do see some suspicious gossip events around that time,
> > not sure if they're related.
> >
> > 2017-01-21_16:43:56.71033 WARN 16:43:56 [GossipTasks:1]: Not marking
> > nodes down due to local pause of 13079498815 > 50
> > 2017-01-21_16:43:56.85532 INFO 16:43:56 [ScheduledTasks:1]: MUTATION
> > messages were dropped in last 5000 ms: 65 for internal timeout and 10895
> > for cross node timeout
> > 2017-01-21_16:43:56.85533 INFO 16:43:56 [ScheduledTasks:1]: READ messages
> > were dropped in last 5000 ms: 33 for internal timeout and 7867 for cross
> > node timeout
> > 2017-01-21_16:43:56.85534 INFO 16:43:56 [ScheduledTasks:1]: Pool Name
> > Active Pending Completed Blocked All Time Blocked
> > 2017-01-21_16:43:56.85534 INFO 16:43:56 [ScheduledTasks:1]: MutationStage
> > 128 47794 1015525068 0 0
> > 2017-01-21_16:43:56.85535
> > 2017-01-21_16:43:56.85535 INFO 16:43:56 [ScheduledTasks:1]: ReadStage
> > 64 20202 450508940 0 0
> >
> > Any suggestions?
> >
> > Thanks!
> >
> > --
> > Dikang
> >
> >
>
>
> --
> Dikang
>