Re: Can we kill the wiki?

2017-03-17 Thread Murukesh Mohanan
I wonder if the recent influx has anything to do with GSoC. The student
application period begins in a few days. I don't see any Cassandra issues
on the GSoC ideas list, though.

On Sat, 18 Mar 2017 at 10:40 Anthony Grasso 
wrote:

+1 to killing the wiki as well. If that is not possible, we should at least
put a note on there saying it is deprecated and point people to the new
docs.

On 18 March 2017 at 08:09, Jonathan Haddad  wrote:

> +1 to killing the wiki.
>
> On Fri, Mar 17, 2017 at 2:08 PM Blake Eggleston 
> wrote:
>
> > With CASSANDRA-8700, docs were moved in tree, with the intention that
> they
> > would replace the wiki. However, it looks like we’re still getting
> regular
> > requests to edit the wiki. It seems like we should be directing these
> folks
> > to the in tree docs and either disabling edits for the wiki, or just
> > removing it entirely, and replacing it with a link to the hosted docs.
> I'd
> > prefer we just remove it myself, makes things less confusing for
> newcomers.
> >
> > Does that seem reasonable to everyone?
>

-- 

Murukesh Mohanan,
Yahoo! Japan


Re: Can we kill the wiki?

2017-03-17 Thread Anthony Grasso
+1 to killing the wiki as well. If that is not possible, we should at least
put a note on there saying it is deprecated and point people to the new
docs.

On 18 March 2017 at 08:09, Jonathan Haddad  wrote:

> +1 to killing the wiki.
>
> On Fri, Mar 17, 2017 at 2:08 PM Blake Eggleston 
> wrote:
>
> > With CASSANDRA-8700, docs were moved in tree, with the intention that
> they
> > would replace the wiki. However, it looks like we’re still getting
> regular
> > requests to edit the wiki. It seems like we should be directing these
> folks
> > to the in tree docs and either disabling edits for the wiki, or just
> > removing it entirely, and replacing it with a link to the hosted docs.
> I'd
> > prefer we just remove it myself, makes things less confusing for
> newcomers.
> >
> > Does that seem reasonable to everyone?
>


Re: Code quality, principles and rules

2017-03-17 Thread Jeremy Hanna
https://issues.apache.org/jira/browse/CASSANDRA-7837 may be some interesting 
context regarding what's been worked on to get rid of singletons and static 
initialization.

> On Mar 17, 2017, at 4:47 PM, Jonathan Haddad  wrote:
> 
> I'd like to think that if someone refactors existing code, making it more
> testable (with tests, of course) it should be acceptable on it's own
> merit.  In fact, in my opinion it sometimes makes more sense to do these
> types of refactorings for the sole purpose of improving stability and
> testability as opposed to mixing them with features.
> 
> You referenced the issue I fixed in one of the early emails.  The fix
> itself was a couple lines of code.  Refactoring the codebase to make it
> testable would have been a huge effort.  I wish I had time to do it.  I
> created CASSANDRA-13007 as a follow up with the intent of working on
> compaction from a purely architectural standpoint.  I think this type of
> thing should be done throughout the codebase.
> 
> Removing the singletons is a good first step, my vote is we just rip off
> the bandaid, do it, and move forward.
> 
> On Fri, Mar 17, 2017 at 2:20 PM Edward Capriolo 
> wrote:
> 
>>> On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown  wrote:
>>> 
>>> To François's point about code coverage for new code, I think this makes
>> a
>>> lot of sense wrt large features (like the current work on
>> 8457/12229/9754).
>>> It's much simpler to (mentally, at least) isolate those changed sections
>>> and it'll show up better in a code coverage report. With small patches,
>>> that might be harder to achieve - however, as the patch should come with
>>> *some* tests (unless it's a truly trivial patch), it might just work
>> itself
>>> out.
>>> 
>>> On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown 
>>> wrote:
>>> 
 As someone who spent a lot of time looking at the singletons topic in
>> the
 past, Blake brings a great perspective here. Figuring out and
>>> communicating
 how best to test with the system we have (and of course incrementally
 making that system easier to work with/test) seems like an achievable
>>> goal.
 
 On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <
>> edlinuxg...@gmail.com
 
 wrote:
 
> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <
>> beggles...@apple.com
 
> wrote:
> 
>> I think we’re getting a little ahead of ourselves talking about DI
>> frameworks. Before that even becomes something worth talking about,
>>> we’d
>> need to have made serious progress on un-spaghettifying Cassandra in
>>> the
>> first place. It’s an extremely tall order. Adding a DI framework
>> right
> now
>> would be like throwing gasoline on a raging tire fire.
>> 
>> Removing singletons seems to come up every 6-12 months, and usually
>> abandoned once people figure out how difficult they are to remove
> properly.
>> I do think removing them *should* be a long term goal, but we really
> need
>> something more immediately actionable. Otherwise, nothing’s going to
>> happen, and we’ll be having this discussion again in a year or so
>> when
>> everyone’s angry that Cassandra 5.0 still isn’t ready for
>> production,
>>> a
>> year after it’s release.
>> 
>> That said, the reason singletons regularly get brought up is because
> doing
>> extensive testing of anything in Cassandra is pretty much
>> impossible,
> since
>> the code is basically this big web of interconnected global state.
> Testing
>> anything in isolation can’t be done, which, for a distributed
>>> database,
> is
>> crazy. It’s a chronic problem that handicaps our ability to release
>> a
>> stable database.
>> 
>> At this point, I think a more pragmatic approach would be to draft
>> and
>> enforce some coding standards that can be applied in day to day
> development
>> that drive incremental improvement of the testing and testability of
>>> the
>> project. What should be tested, how it should be tested. How to
>> write
> new
>> code that talks to the rest of Cassandra and is testable. How to fix
> bugs
>> in old code in a way that’s testable. We should also have some
> guidelines
>> around refactoring the wildly untested sections, how to get started,
> what
>> to do, what not to do, etc.
>> 
>> Thoughts?
> 
> 
> To make the conversation practical. There is one class I personally
>>> really
> want to refactor so it can be tested:
> 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> apache/cassandra/net/OutboundTcpConnection.java
> 
> There is little coverage here. Questions like:
> what errors cause the connection to restart?
> when are undropable messages are dropped?
> what happens when the queue fills up?
> Infamous 

Re: Code quality, principles and rules

2017-03-17 Thread Jonathan Haddad
I'd like to think that if someone refactors existing code, making it more
testable (with tests, of course) it should be acceptable on its own
merit.  In fact, in my opinion it sometimes makes more sense to do these
types of refactorings for the sole purpose of improving stability and
testability as opposed to mixing them with features.

You referenced the issue I fixed in one of the early emails.  The fix
itself was a couple lines of code.  Refactoring the codebase to make it
testable would have been a huge effort.  I wish I had time to do it.  I
created CASSANDRA-13007 as a follow up with the intent of working on
compaction from a purely architectural standpoint.  I think this type of
thing should be done throughout the codebase.

Removing the singletons is a good first step; my vote is we just rip off
the bandaid, do it, and move forward.

On Fri, Mar 17, 2017 at 2:20 PM Edward Capriolo 
wrote:

> On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown  wrote:
>
> > To François's point about code coverage for new code, I think this makes
> a
> > lot of sense wrt large features (like the current work on
> 8457/12229/9754).
> > It's much simpler to (mentally, at least) isolate those changed sections
> > and it'll show up better in a code coverage report. With small patches,
> > that might be harder to achieve - however, as the patch should come with
> > *some* tests (unless it's a truly trivial patch), it might just work
> itself
> > out.
> >
> > On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown 
> > wrote:
> >
> > > As someone who spent a lot of time looking at the singletons topic in
> the
> > > past, Blake brings a great perspective here. Figuring out and
> > communicating
> > > how best to test with the system we have (and of course incrementally
> > > making that system easier to work with/test) seems like an achievable
> > goal.
> > >
> > > On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <
> edlinuxg...@gmail.com
> > >
> > > wrote:
> > >
> > >> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <
> beggles...@apple.com
> > >
> > >> wrote:
> > >>
> > >> > I think we’re getting a little ahead of ourselves talking about DI
> > >> > frameworks. Before that even becomes something worth talking about,
> > we’d
> > >> > need to have made serious progress on un-spaghettifying Cassandra in
> > the
> > >> > first place. It’s an extremely tall order. Adding a DI framework
> right
> > >> now
> > >> > would be like throwing gasoline on a raging tire fire.
> > >> >
> > >> > Removing singletons seems to come up every 6-12 months, and usually
> > >> > abandoned once people figure out how difficult they are to remove
> > >> properly.
> > >> > I do think removing them *should* be a long term goal, but we really
> > >> need
> > >> > something more immediately actionable. Otherwise, nothing’s going to
> > >> > happen, and we’ll be having this discussion again in a year or so
> when
> > >> > everyone’s angry that Cassandra 5.0 still isn’t ready for
> production,
> > a
> > >> > year after it’s release.
> > >> >
> > >> > That said, the reason singletons regularly get brought up is because
> > >> doing
> > >> > extensive testing of anything in Cassandra is pretty much
> impossible,
> > >> since
> > >> > the code is basically this big web of interconnected global state.
> > >> Testing
> > >> > anything in isolation can’t be done, which, for a distributed
> > database,
> > >> is
> > >> > crazy. It’s a chronic problem that handicaps our ability to release
> a
> > >> > stable database.
> > >> >
> > >> > At this point, I think a more pragmatic approach would be to draft
> and
> > >> > enforce some coding standards that can be applied in day to day
> > >> development
> > >> > that drive incremental improvement of the testing and testability of
> > the
> > >> > project. What should be tested, how it should be tested. How to
> write
> > >> new
> > >> > code that talks to the rest of Cassandra and is testable. How to fix
> > >> bugs
> > >> > in old code in a way that’s testable. We should also have some
> > >> guidelines
> > >> > around refactoring the wildly untested sections, how to get started,
> > >> what
> > >> > to do, what not to do, etc.
> > >> >
> > >> > Thoughts?
> > >>
> > >>
> > >> To make the conversation practical. There is one class I personally
> > really
> > >> want to refactor so it can be tested:
> > >>
> > >> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> > >> apache/cassandra/net/OutboundTcpConnection.java
> > >>
> > >> There is little coverage here. Questions like:
> > >> what errors cause the connection to restart?
> > >> when are undropable messages are dropped?
> > >> what happens when the queue fills up?
> > >> Infamous throw new AssertionError(ex); (which probably bubble up to
> > >> nowhere)
> > >> what does the COALESCED strategy do in case XYZ.
> > >> A nifty label (wow a label you just never see those much!)
> > >> outer:
> > >> 

Re: Code quality, principles and rules

2017-03-17 Thread Edward Capriolo
On Fri, Mar 17, 2017 at 2:31 PM, Jason Brown  wrote:

> To François's point about code coverage for new code, I think this makes a
> lot of sense wrt large features (like the current work on 8457/12229/9754).
> It's much simpler to (mentally, at least) isolate those changed sections
> and it'll show up better in a code coverage report. With small patches,
> that might be harder to achieve - however, as the patch should come with
> *some* tests (unless it's a truly trivial patch), it might just work itself
> out.
>
> On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown 
> wrote:
>
> > As someone who spent a lot of time looking at the singletons topic in the
> > past, Blake brings a great perspective here. Figuring out and
> communicating
> > how best to test with the system we have (and of course incrementally
> > making that system easier to work with/test) seems like an achievable
> goal.
> >
> > On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo  >
> > wrote:
> >
> >> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston  >
> >> wrote:
> >>
> >> > I think we’re getting a little ahead of ourselves talking about DI
> >> > frameworks. Before that even becomes something worth talking about,
> we’d
> >> > need to have made serious progress on un-spaghettifying Cassandra in
> the
> >> > first place. It’s an extremely tall order. Adding a DI framework right
> >> now
> >> > would be like throwing gasoline on a raging tire fire.
> >> >
> >> > Removing singletons seems to come up every 6-12 months, and usually
> >> > abandoned once people figure out how difficult they are to remove
> >> properly.
> >> > I do think removing them *should* be a long term goal, but we really
> >> need
> >> > something more immediately actionable. Otherwise, nothing’s going to
> >> > happen, and we’ll be having this discussion again in a year or so when
> >> > everyone’s angry that Cassandra 5.0 still isn’t ready for production,
> a
> >> > year after it’s release.
> >> >
> >> > That said, the reason singletons regularly get brought up is because
> >> doing
> >> > extensive testing of anything in Cassandra is pretty much impossible,
> >> since
> >> > the code is basically this big web of interconnected global state.
> >> Testing
> >> > anything in isolation can’t be done, which, for a distributed
> database,
> >> is
> >> > crazy. It’s a chronic problem that handicaps our ability to release a
> >> > stable database.
> >> >
> >> > At this point, I think a more pragmatic approach would be to draft and
> >> > enforce some coding standards that can be applied in day to day
> >> development
> >> > that drive incremental improvement of the testing and testability of
> the
> >> > project. What should be tested, how it should be tested. How to write
> >> new
> >> > code that talks to the rest of Cassandra and is testable. How to fix
> >> bugs
> >> > in old code in a way that’s testable. We should also have some
> >> guidelines
> >> > around refactoring the wildly untested sections, how to get started,
> >> what
> >> > to do, what not to do, etc.
> >> >
> >> > Thoughts?
> >>
> >>
> >> To make the conversation practical. There is one class I personally
> really
> >> want to refactor so it can be tested:
> >>
> >> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> >> apache/cassandra/net/OutboundTcpConnection.java
> >>
> >> There is little coverage here. Questions like:
> >> what errors cause the connection to restart?
> >> when are undropable messages are dropped?
> >> what happens when the queue fills up?
> >> Infamous throw new AssertionError(ex); (which probably bubble up to
> >> nowhere)
> >> what does the COALESCED strategy do in case XYZ.
> >> A nifty label (wow a label you just never see those much!)
> >> outer:
> >> while (!isStopped)
> >>
> >> Comments to jira's that probably are not explicitly tested:
> >> // If we haven't retried this message yet, put it back on the queue to
> >> retry after re-connecting.
> >> // See CASSANDRA-5393 and CASSANDRA-12192.
> >>
> >> If I were to undertake this cleanup, would there actually be support? IE
> >> if
> >> this going to turn into an "it aint broken. don't fix it thing" or a "we
> >> don't want to change stuff just to add tests" . Like will someone pledge
> >> to
> >> agree its kinda wonky and merge the effort in < 1 years time?
> >>
> >
> >
>

So ... :) If I open a ticket to refactor OutboundTcpConnection.java to do
specific unit testing, and possibly even pull things out to the point that I
can actually open a socket and do an end-to-end test, will you/anyone
support that? (It sounds like you're saying I must/should make a large
feature to add a test.)


Re: Can we kill the wiki?

2017-03-17 Thread Jonathan Haddad
+1 to killing the wiki.

On Fri, Mar 17, 2017 at 2:08 PM Blake Eggleston 
wrote:

> With CASSANDRA-8700, docs were moved in tree, with the intention that they
> would replace the wiki. However, it looks like we’re still getting regular
> requests to edit the wiki. It seems like we should be directing these folks
> to the in tree docs and either disabling edits for the wiki, or just
> removing it entirely, and replacing it with a link to the hosted docs. I'd
> prefer we just remove it myself, makes things less confusing for newcomers.
>
> Does that seem reasonable to everyone?


Can we kill the wiki?

2017-03-17 Thread Blake Eggleston
With CASSANDRA-8700, docs were moved in-tree, with the intention that they
would replace the wiki. However, it looks like we’re still getting regular
requests to edit the wiki. It seems like we should be directing these folks to
the in-tree docs and either disabling edits for the wiki, or just removing it
entirely and replacing it with a link to the hosted docs. I'd prefer we just
remove it myself; that makes things less confusing for newcomers.

Does that seem reasonable to everyone?

Re: Documentation contributors guide

2017-03-17 Thread Jeff Jirsa
> > On 2017-03-17 12:33 (-0700), Stefan Podkowinski  wrote: 
> > 
> >> As you can see there's a large part about using GitHub for editing on
> >> the page. I'd like to know what you think about that and if you'd agree
> >> to accept PRs for such purposes.

I don't want to bury the important point in minutiae of actually committing:

Your doc/howto on docs is awesome. Much needed and welcome addition to the docs.









Re: Documentation contributors guide

2017-03-17 Thread Jeff Jirsa


On 2017-03-17 13:06 (-0700), benjamin roth  wrote: 
> Isn't there a way to script that with just a few lines of python or
> whatever?

For docs, probably. Real patches are harder.  

There's a minor problem that they're a bit spammy (all PRs create dev@ emails),
but I'd rather tolerate that noise than discourage contributors, so I
(personally; not sure if the rest of the committers/PMC agree) encourage people
to send GitHub PRs if it's the only way they can contribute.




Re: Documentation contributors guide

2017-03-17 Thread Stefan Podkowinski
I don't see how that would be harder compared to merging a patch
attached to a jira ticket. If you wanted to merge my PR, you'd just have
to do something like this:

curl -o docs.patch https://github.com/apache/cassandra/compare/trunk...spodkowinski:docs_gettingstarted.patch
git am docs.patch
git reset --soft origin/trunk
git commit   # add a proper commit message and a "Merges #" text to automatically close the PR



On 03/17/2017 09:03 PM, Jeff Jirsa wrote:
> 
> 
> On 2017-03-17 12:33 (-0700), Stefan Podkowinski  wrote: 
> 
>> As you can see there's a large part about using GitHub for editing on
>> the page. I'd like to know what you think about that and if you'd agree
>> to accept PRs for such purposes.
>>
> 
> The challenge of github PRs isn't that we don't want them, it's that we can't 
> merge them - the apache github repo is a read only mirror (the master is on 
> ASF infrastructure). 
> 
> Personally, I'd rather have a Github PR than no patch, but I'd much rather 
> have a JIRA patch than a Github PR, because ultimately the committer is going 
> to have to manually transform the Github PR into a .patch file and commit it 
> with a special commit message to close the Github PR (or hope that the 
> contributor closes it for us, because committers can't even close PRs at this 
> point). 
> 
>> I'd also like to add another section for committers that describes the
>> required steps to actually publish the latest trunk to our website. I
>> know that svn has been mentioned somewhere, but I would appreciate if
>> someone either adds that section or just shares some details in this thread.
> 
> The repo is at https://svn.apache.org/repos/asf/cassandra/ - there's a doc at 
> https://svn.apache.org/repos/asf/cassandra/site/src/README that describes it. 
> 


Re: Documentation contributors guide

2017-03-17 Thread benjamin roth
Isn't there a way to script that with just a few lines of python or
whatever?

2017-03-17 21:03 GMT+01:00 Jeff Jirsa :

>
>
> On 2017-03-17 12:33 (-0700), Stefan Podkowinski  wrote:
>
> > As you can see there's a large part about using GitHub for editing on
> > the page. I'd like to know what you think about that and if you'd agree
> > to accept PRs for such purposes.
> >
>
> The challenge of github PRs isn't that we don't want them, it's that we
> can't merge them - the apache github repo is a read only mirror (the master
> is on ASF infrastructure).
>
> Personally, I'd rather have a Github PR than no patch, but I'd much rather
> have a JIRA patch than a Github PR, because ultimately the committer is
> going to have to manually transform the Github PR into a .patch file and
> commit it with a special commit message to close the Github PR (or hope
> that the contributor closes it for us, because committers can't even close
> PRs at this point).
>
> > I'd also like to add another section for committers that describes the
> > required steps to actually publish the latest trunk to our website. I
> > know that svn has been mentioned somewhere, but I would appreciate if
> > someone either adds that section or just shares some details in this
> thread.
>
> The repo is at https://svn.apache.org/repos/asf/cassandra/ - there's a
> doc at https://svn.apache.org/repos/asf/cassandra/site/src/README that
> describes it.
>


Re: Documentation contributors guide

2017-03-17 Thread Jeff Jirsa


On 2017-03-17 12:33 (-0700), Stefan Podkowinski  wrote: 

> As you can see there's a large part about using GitHub for editing on
> the page. I'd like to know what you think about that and if you'd agree
> to accept PRs for such purposes.
> 

The challenge of GitHub PRs isn't that we don't want them, it's that we can't
merge them: the Apache GitHub repo is a read-only mirror (the master is on ASF
infrastructure).

Personally, I'd rather have a GitHub PR than no patch, but I'd much rather have
a JIRA patch than a GitHub PR, because ultimately the committer is going to
have to manually transform the GitHub PR into a .patch file and commit it with
a special commit message to close the GitHub PR (or hope that the contributor
closes it for us, because committers can't even close PRs at this point).

> I'd also like to add another section for committers that describes the
> required steps to actually publish the latest trunk to our website. I
> know that svn has been mentioned somewhere, but I would appreciate if
> someone either adds that section or just shares some details in this thread.

The repo is at https://svn.apache.org/repos/asf/cassandra/ - there's a doc at 
https://svn.apache.org/repos/asf/cassandra/site/src/README that describes it. 


Documentation contributors guide

2017-03-17 Thread Stefan Podkowinski
There's recently been a discussion about the wiki and how we should
continue to work on the documentation in general. One of my suggestions
was to start by giving users clearer guidelines on how they can
contribute to our documentation, before having a technical discussion
around tools and wikis again.

I've now created a first version of such a guide that can be found here:
https://github.com/spodkowinski/cassandra/blob/docs_gettingstarted/doc/source/development/documentation.rst

As you can see, a large part of the page is about using GitHub for
editing. I'd like to know what you think about that and if you'd agree
to accept PRs for such purposes.

I'd also like to add another section for committers that describes the
required steps to actually publish the latest trunk to our website. I
know that svn has been mentioned somewhere, but I would appreciate it if
someone either added that section or just shared some details in this thread.

Cheers!



Re: Code quality, principles and rules

2017-03-17 Thread benjamin roth
I think you can refactor any project with little risk and increase test
coverage.
What is needed:
Rules. Discipline. Perseverance. Small iterations. Small iterations. Small
iterations.

   - Refactor in the smallest possible unit
   - Split large classes into smaller ones. Remove god classes by pulling
   out single methods or aspects. Maybe factor out method by method.
   - Maintain compatibility. Build facades, adapters, proxy objects for
   compatibility during the refactoring process. Do not break interfaces
   when doing so is unnecessary or risky.
   - Push state into corners. E.g. when refactoring a single method, pass
   global state in as a parameter, so that this single method becomes testable.

If you iterate like this maybe 1000 times, you will most likely break far
fewer things than with a big-bang refactor. You make code testable in
small steps.
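
As a rough illustration of the "pass global state as a parameter" point from
the list above, here is a minimal Java sketch. GlobalConfig, TimeoutPolicy and
the timeout values are hypothetical names made up for this example; they are
not Cassandra classes, and this is only one way such a step could look.

```java
// All names here (TimeoutPolicy, GlobalConfig, timeoutMillis) are hypothetical.
final class TimeoutPolicy {
    // Before: the method reads global state itself, so a test must mutate the singleton.
    static boolean isExpiredLegacy(long createdAtMillis, long nowMillis) {
        return nowMillis - createdAtMillis > GlobalConfig.instance.timeoutMillis;
    }

    // After: the global value is passed in, so the logic can be tested in isolation.
    static boolean isExpired(long createdAtMillis, long nowMillis, long timeoutMillis) {
        return nowMillis - createdAtMillis > timeoutMillis;
    }

    // Thin adapter with the old signature keeps existing call sites compatible.
    static boolean isExpired(long createdAtMillis, long nowMillis) {
        return isExpired(createdAtMillis, nowMillis, GlobalConfig.instance.timeoutMillis);
    }

    public static void main(String[] args) {
        System.out.println(isExpired(0L, 1_500L, 1_000L)); // no singleton involved
        System.out.println(isExpired(0L, 1_500L));         // still works via the adapter
    }
}

// Stand-in for a global configuration singleton.
final class GlobalConfig {
    static final GlobalConfig instance = new GlobalConfig();
    volatile long timeoutMillis = 10_000L;

    private GlobalConfig() {}
}
```

The three-argument method is the only one a unit test needs to call; the
two-argument adapter is the "corner" where the global state still lives.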

Global state is the biggest disease the history of programming has ever seen.
Singletons are also not super great to test, and static methods should be
avoided at all costs if they contain state.
Tested idempotent static methods should not be a problem.

From my experience, you don't need a bloated DI framework to make a class
testable that depends somehow on static methods or singletons.
You just have to push the bad guys into a corner where they do no harm and
can be killed without risk in the very end.
E.g. instead of having calls to SomeClass.instance.doWhatEver() spread here
and there, the call can be encapsulated in a single method like
TestableClass.doWhatever() { SomeClass.instance.doWhatEver(); }
Or the whole singleton is retrieved through TestableClass.getSomeClass().
So you can either mock the hell out of it or inject a non-singleton
instance of that class at test runtime.
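
To make the accessor idea above concrete, here is a small Java sketch.
SomeClass and TestableClass are the placeholder names used in this mail;
FakeSomeClass and the returned strings are assumptions invented for the
example, so treat it as a sketch rather than a recipe for any real class.

```java
// SomeClass and TestableClass are the placeholder names from the text above;
// FakeSomeClass and the return values are made up for this sketch.
class TestableClass {
    // The single place that touches the singleton; tests can override it.
    protected SomeClass getSomeClass() {
        return SomeClass.instance;
    }

    String doWhatever() {
        return "result:" + getSomeClass().doWhatEver();
    }

    public static void main(String[] args) {
        // Production path: goes through the real singleton.
        System.out.println(new TestableClass().doWhatever());

        // Test path: inject a non-singleton instance by overriding the accessor.
        TestableClass underTest = new TestableClass() {
            @Override
            protected SomeClass getSomeClass() {
                return new FakeSomeClass();
            }
        };
        System.out.println(underTest.doWhatever());
    }
}

// Hypothetical singleton standing in for any "X.instance" in the codebase.
class SomeClass {
    static final SomeClass instance = new SomeClass();

    String doWhatEver() {
        return "real";
    }
}

// Hand-written fake; a mocking library could be used instead.
class FakeSomeClass extends SomeClass {
    @Override
    String doWhatEver() {
        return "fake";
    }
}
```

Once every caller goes through getSomeClass(), replacing the singleton with a
constructor argument later becomes a change in one place only.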


2017-03-17 19:19 GMT+01:00 Jason Brown :

> As someone who spent a lot of time looking at the singletons topic in the
> past, Blake brings a great perspective here. Figuring out and communicating
> how best to test with the system we have (and of course incrementally
> making that system easier to work with/test) seems like an achievable goal.
>
> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo 
> wrote:
>
> > On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston 
> > wrote:
> >
> > > I think we’re getting a little ahead of ourselves talking about DI
> > > frameworks. Before that even becomes something worth talking about,
> we’d
> > > need to have made serious progress on un-spaghettifying Cassandra in
> the
> > > first place. It’s an extremely tall order. Adding a DI framework right
> > now
> > > would be like throwing gasoline on a raging tire fire.
> > >
> > > Removing singletons seems to come up every 6-12 months, and usually
> > > abandoned once people figure out how difficult they are to remove
> > properly.
> > > I do think removing them *should* be a long term goal, but we really
> need
> > > something more immediately actionable. Otherwise, nothing’s going to
> > > happen, and we’ll be having this discussion again in a year or so when
> > > everyone’s angry that Cassandra 5.0 still isn’t ready for production, a
> > > year after it’s release.
> > >
> > > That said, the reason singletons regularly get brought up is because
> > doing
> > > extensive testing of anything in Cassandra is pretty much impossible,
> > since
> > > the code is basically this big web of interconnected global state.
> > Testing
> > > anything in isolation can’t be done, which, for a distributed database,
> > is
> > > crazy. It’s a chronic problem that handicaps our ability to release a
> > > stable database.
> > >
> > > At this point, I think a more pragmatic approach would be to draft and
> > > enforce some coding standards that can be applied in day to day
> > development
> > > that drive incremental improvement of the testing and testability of
> the
> > > project. What should be tested, how it should be tested. How to write
> new
> > > code that talks to the rest of Cassandra and is testable. How to fix
> bugs
> > > in old code in a way that’s testable. We should also have some
> guidelines
> > > around refactoring the wildly untested sections, how to get started,
> what
> > > to do, what not to do, etc.
> > >
> > > Thoughts?
> >
> >
> > To make the conversation practical. There is one class I personally
> really
> > want to refactor so it can be tested:
> >
> > https://github.com/apache/cassandra/blob/trunk/src/java/
> > org/apache/cassandra/net/OutboundTcpConnection.java
> >
> > There is little coverage here. Questions like:
> > what errors cause the connection to restart?
> > when are undropable messages are dropped?
> > what happens when the queue fills up?
> > Infamous throw new AssertionError(ex); (which probably bubble up to
> > nowhere)
> > what does the COALESCED strategy do in case XYZ.
> > A nifty label (wow a label you just never see those much!)
> > outer:
> > while (!isStopped)
> >
> > Comments to jira's that probably are not explicitly 

Re: Code quality, principles and rules

2017-03-17 Thread Jason Brown
To François's point about code coverage for new code, I think this makes a
lot of sense wrt large features (like the current work on 8457/12229/9754).
It's much simpler to (mentally, at least) isolate those changed sections
and it'll show up better in a code coverage report. With small patches,
that might be harder to achieve - however, as the patch should come with
*some* tests (unless it's a truly trivial patch), it might just work itself
out.

On Fri, Mar 17, 2017 at 11:19 AM, Jason Brown  wrote:

> As someone who spent a lot of time looking at the singletons topic in the
> past, Blake brings a great perspective here. Figuring out and communicating
> how best to test with the system we have (and of course incrementally
> making that system easier to work with/test) seems like an achievable goal.
>
> On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo 
> wrote:
>
>> On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston 
>> wrote:
>>
>> > I think we’re getting a little ahead of ourselves talking about DI
>> > frameworks. Before that even becomes something worth talking about, we’d
>> > need to have made serious progress on un-spaghettifying Cassandra in the
>> > first place. It’s an extremely tall order. Adding a DI framework right
>> now
>> > would be like throwing gasoline on a raging tire fire.
>> >
>> > Removing singletons seems to come up every 6-12 months, and usually
>> > abandoned once people figure out how difficult they are to remove
>> properly.
>> > I do think removing them *should* be a long term goal, but we really
>> need
>> > something more immediately actionable. Otherwise, nothing’s going to
>> > happen, and we’ll be having this discussion again in a year or so when
>> > everyone’s angry that Cassandra 5.0 still isn’t ready for production, a
>> > year after it’s release.
>> >
>> > That said, the reason singletons regularly get brought up is because
>> doing
>> > extensive testing of anything in Cassandra is pretty much impossible,
>> since
>> > the code is basically this big web of interconnected global state.
>> Testing
>> > anything in isolation can’t be done, which, for a distributed database,
>> is
>> > crazy. It’s a chronic problem that handicaps our ability to release a
>> > stable database.
>> >
>> > At this point, I think a more pragmatic approach would be to draft and
>> > enforce some coding standards that can be applied in day to day
>> > development that drive incremental improvement of the testing and
>> > testability of the project. What should be tested, how it should be
>> > tested. How to write new code that talks to the rest of Cassandra and is
>> > testable. How to fix bugs in old code in a way that’s testable. We should
>> > also have some guidelines around refactoring the wildly untested
>> > sections, how to get started, what to do, what not to do, etc.
>> >
>> > Thoughts?
>>
>>
>> To make the conversation practical: there is one class I personally really
>> want to refactor so it can be tested:
>>
>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java
>>
>> There is little coverage here. Questions like:
>> What errors cause the connection to restart?
>> When are undroppable messages dropped?
>> What happens when the queue fills up?
>> The infamous throw new AssertionError(ex); (which probably bubbles up to
>> nowhere).
>> What does the COALESCED strategy do in case XYZ?
>> A nifty label (wow, a label; you just never see those much!):
>> outer:
>> while (!isStopped)
>>
>> Comments referencing JIRA tickets that are probably not explicitly tested:
>> // If we haven't retried this message yet, put it back on the queue to
>> // retry after re-connecting.
>> // See CASSANDRA-5393 and CASSANDRA-12192.
>>
>> If I were to undertake this cleanup, would there actually be support? I.e.,
>> is this going to turn into an "it ain't broken, don't fix it" thing or a
>> "we don't want to change stuff just to add tests" thing? Will someone
>> pledge to agree it's kinda wonky and merge the effort in under a year's
>> time?
>>
>
>
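
One way to make questions like "what happens when the queue fills up?" and "is a failed message retried exactly once?" answerable by a unit test is to pull those decisions out of the socket loop into a small policy object. The following is only a sketch under stated assumptions: `QueuedMessage` and `BacklogPolicy` are hypothetical names, not classes in Cassandra, and the policy itself (reject droppable messages when full, always accept non-droppable ones, re-queue a failed message at the front once) is invented for illustration and is not a description of what OutboundTcpConnection actually does.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical message wrapper; the real message type in Cassandra differs.
final class QueuedMessage {
    final String payload;
    final boolean droppable;
    boolean retried; // set once the message has been re-queued after a failed write

    QueuedMessage(String payload, boolean droppable) {
        this.payload = payload;
        this.droppable = droppable;
    }
}

// Hypothetical policy object: holds queueing decisions only, no sockets or threads.
final class BacklogPolicy {
    private final Deque<QueuedMessage> backlog = new ArrayDeque<>();
    private final int capacity;
    private int dropped;

    BacklogPolicy(int capacity) {
        this.capacity = capacity;
    }

    // Assumed rule: when full, droppable messages are rejected and counted;
    // non-droppable messages are accepted even past capacity.
    boolean offer(QueuedMessage message) {
        if (backlog.size() >= capacity && message.droppable) {
            dropped++;
            return false;
        }
        backlog.addLast(message);
        return true;
    }

    // Assumed rule: after a failed write, re-queue at the front once;
    // a second failure of the same message drops it.
    boolean onWriteFailure(QueuedMessage message) {
        if (message.retried) {
            dropped++;
            return false;
        }
        message.retried = true;
        backlog.addFirst(message);
        return true;
    }

    QueuedMessage poll() {
        return backlog.pollFirst();
    }

    int size() {
        return backlog.size();
    }

    int droppedCount() {
        return dropped;
    }
}

final class BacklogPolicyDemo {
    public static void main(String[] args) {
        BacklogPolicy policy = new BacklogPolicy(1);
        QueuedMessage m1 = new QueuedMessage("m1", true);
        System.out.println(policy.offer(m1));                             // true
        System.out.println(policy.offer(new QueuedMessage("m2", true)));  // false: full and droppable
        System.out.println(policy.offer(new QueuedMessage("m3", false))); // true: non-droppable
        QueuedMessage head = policy.poll();                               // m1
        System.out.println(policy.onWriteFailure(head));                  // true: first retry
        System.out.println(policy.poll() == head);                        // true: re-queued at front
        System.out.println(policy.onWriteFailure(head));                  // false: already retried
        System.out.println(policy.droppedCount());                        // 2
    }
}
```

With the decisions isolated like this, each of the questions above becomes a few lines of test code with no network involved; whether the real class can be split this cleanly is exactly what the refactoring would have to find out.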


Re: Contribute to the Cassandra Wiki

2017-03-17 Thread Dave Brosius

done

---


On 2017-03-17 06:02, Rahul S wrote:

Hi
I would like to contribute to the Cassandra wiki please. My username
is rahul3.

Thanks
Rahul



Re: Code quality, principles and rules

2017-03-17 Thread Blake Eggleston
I think we’re getting a little ahead of ourselves talking about DI frameworks. 
Before that even becomes something worth talking about, we’d need to have made 
serious progress on un-spaghettifying Cassandra in the first place. It’s an 
extremely tall order. Adding a DI framework right now would be like throwing 
gasoline on a raging tire fire.

Removing singletons seems to come up every 6-12 months, and is usually
abandoned once people figure out how difficult they are to remove properly. I
do think removing them *should* be a long term goal, but we really need
something more immediately actionable. Otherwise, nothing’s going to happen,
and we’ll be having this discussion again in a year or so when everyone’s
angry that Cassandra 5.0 still isn’t ready for production, a year after its
release.

That said, the reason singletons regularly get brought up is because doing 
extensive testing of anything in Cassandra is pretty much impossible, since the 
code is basically this big web of interconnected global state. Testing anything 
in isolation can’t be done, which, for a distributed database, is crazy. It’s a 
chronic problem that handicaps our ability to release a stable database.

At this point, I think a more pragmatic approach would be to draft and enforce 
some coding standards that can be applied in day to day development that drive 
incremental improvement of the testing and testability of the project. What 
should be tested, how it should be tested. How to write new code that talks to 
the rest of Cassandra and is testable. How to fix bugs in old code in a way 
that’s testable. We should also have some guidelines around refactoring the 
wildly untested sections, how to get started, what to do, what not to do, etc.

Thoughts?
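
As a rough illustration of what "new code that talks to the rest of Cassandra and is testable" might mean in practice, here is a minimal sketch of a component that receives its collaborators through the constructor instead of reaching for static singletons. Everything named here (`MessageSink`, `ExpiringSender`, the expiry rule) is hypothetical and made up for the example; none of it is an existing Cassandra class or a proposed standard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical collaborator interface standing in for "the rest of the system".
interface MessageSink {
    void send(String endpoint, String payload);
}

// Hypothetical component: its dependencies (sink, clock) are passed in,
// so a test can substitute fakes without touching any global state.
final class ExpiringSender {
    private final MessageSink sink;
    private final LongSupplier clockMillis;
    private final long maxAgeMillis;

    ExpiringSender(MessageSink sink, LongSupplier clockMillis, long maxAgeMillis) {
        this.sink = sink;
        this.clockMillis = clockMillis;
        this.maxAgeMillis = maxAgeMillis;
    }

    // Sends the payload unless it is older than maxAgeMillis; returns true if sent.
    boolean dispatch(String endpoint, String payload, long createdAtMillis) {
        if (clockMillis.getAsLong() - createdAtMillis > maxAgeMillis) {
            return false;
        }
        sink.send(endpoint, payload);
        return true;
    }
}

final class ExpiringSenderDemo {
    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        // Fake sink that records calls, and a fixed clock at t = 1000 ms.
        MessageSink recordingSink = (endpoint, payload) -> sent.add(endpoint + ":" + payload);
        ExpiringSender sender = new ExpiringSender(recordingSink, () -> 1_000L, 500L);

        boolean fresh = sender.dispatch("10.0.0.1", "m1", 800L); // age 200 ms, sent
        boolean stale = sender.dispatch("10.0.0.1", "m2", 100L); // age 900 ms, skipped
        System.out.println(fresh + " " + stale + " " + sent);
        // prints: true false [10.0.0.1:m1]
    }
}
```

This is plain constructor injection done by hand, with no framework involved; the production wiring would pass the real implementations at the one place where the object is created.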

Re: Code quality, principles and rules

2017-03-17 Thread Edward Capriolo
On Fri, Mar 17, 2017 at 6:41 AM, Ryan Svihla  wrote:

> Different DI frameworks have different initialization costs, and even within
> Spring the cost depends on how you wire up dependencies (autowiring with
> reflection, parsing a giant XML file of explicit dependencies, etc.).
>
> To back this assertion up: for a while in that community, benchmarking the
> performance of different DI frameworks was a thing, and you can find
> benchmarks galore with a quick Google search.
>
> The practical cost is also dependent on the lifecycles used (transient
> versus Singleton style for example) and features used (Interceptors
> depending on implementation can get expensive).
>
> So I think there should be some quantification of cost before a framework
> is considered. Something like dagger2, which uses codegen, I wager is only a
> cost at compile time (I have not benched it, but looking at its feature set,
> that's my guess). Spring, I know from experience, is slower on
> initialization than doing DI "by hand" at minimum, even with the most
> optimal settings, and that can sometimes be substantial.
>
>
> On Mar 17, 2017 12:29 AM, "Edward Capriolo"  wrote:
>
> On Thu, Mar 16, 2017 at 5:18 PM, Jason Brown  wrote:
>
> > >> do we have plan to integrate with a dependency injection framework?
> >
> > > No, we (the maintainers) have been pretty much against more frameworks
> > > due to performance reasons, overhead, and dependency management problems.
> >
> > On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou 
> > wrote:
> >
> > > > Since we're here, do we have plan to integrate with a dependency
> > > > injection framework like Dagger2? Otherwise it'll be difficult to write
> > > > unit test cases.
> > >
> > > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <
> edlinuxg...@gmail.com>
> > > wrote:
> > >
> > > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa 
> wrote:
> > > >
> > > > >
> > > > >
> > > > > On 2017-03-16 10:32 (-0700), François Deliège <
> > franc...@instagram.com>
> > > > > wrote:
> > > > > >
> > > > > > To get this started, here is an initial proposal:
> > > > > >
> > > > > > Principles:
> > > > > >
> > > > > > > 1. Tests always pass.  This is the starting point. If we don't
> > > > > > > care about test failures, then we should stop writing tests. A
> > > > > > > recurring failing test carries no signal and is better deleted.
> > > > > > > 2. The code is tested.
> > > > > > >
> > > > > > > Assuming we can align on these principles, here is a proposal
> > > > > > > for their implementation.
> > > > > > >
> > > > > > > Rules:
> > > > > > >
> > > > > > > 1. Each new release passes all tests (no flakiness).
> > > > > > > 2. If a patch has a failing test (test touching the same code
> > > > > > > path), the code or test should be fixed prior to being accepted.
> > > > > > > 3. Bug fixes should have one test that fails prior to the fix
> > > > > > > and passes after the fix.
> > > > > > > 4. New code should have at least 90% test coverage.
> > > > > >
> > > > > First I was
> > > > > > I agree with all of these and hope they become codified and
> > > > > > followed. I don't know anyone who believes we should be committing
> > > > > > code that breaks tests - but we should be more strict with
> > > > > > requiring green test runs, and perhaps more strict with reverting
> > > > > > patches that break tests (or cause them to be flakey).
> > > > > >
> > > > > > Ed also noted on the user list [0] that certain sections of the
> > > > > > code itself are difficult to test because of singletons - I agree
> > > > > > with the suggestion that it's time to revisit CASSANDRA-7837 and
> > > > > > CASSANDRA-10283
> > > > > >
> > > > > > Finally, we should also recall Jason's previous notes [1] that the
> > > > > > actual test infrastructure available is limited - the system
> > > > > > provided by Datastax is not generally open to everyone (and not
> > > > > > guaranteed to be permanent), and the infrastructure currently
> > > > > > available to the ASF is somewhat limited (much slower, at the very
> > > > > > least). If we require tests passing (and I agree that we should),
> > > > > > we need to define how we're going to be testing (or how we're
> > > > > > going to be sharing test results), because the ASF hardware isn't
> > > > > > going to be able to do dozens of dev branch dtest runs per day in
> > > > > > its current form.
> > > > >
> > > > > > 0: https://lists.apache.org/thread.html/f6f3fc6d0ad1bd54a6185ce7bd7a2f6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > > > > 1: https://lists.apache.org/thread.html/5fb8f0446ab97644100e4ef987f36e07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > > > >
> > > > >
> > > > >
> > > > Ed also noted on the user list [0] that certain sections of the code
> > > itself
> > > > are difficult to test because of singletons - I agree with the
> > 

Contribute to the Cassandra Wiki

2017-03-17 Thread Rahul S
Hi
I would like to contribute to the Cassandra wiki please. My username is 
rahul3.

Thanks
Rahul
