Re: New strategy for Netty (ZOOKEEPER-823) was: What's the QA strategy of ZooKeeper?
This sounds great. Thanks mahadev On 10/16/10 1:56 AM, Thomas Koch tho...@koch.ro wrote: Benjamin Reed: actually, the other way of doing the netty patch (since i'm scared of merges) would be to do a refactor cleanup patch with an eye toward netty, and then another patch to actually add netty. [...] Hi Benjamin, I've had exactly the same thought last evening. Instead of trying to find the bug(s) in the current patch, I'd like to start it over again and do small incremental changes from the current trunk towards the current ZOOKEEPER-823 patch. Maybe I could do this in ZOOKEEPER-823 patch, this would mean to revert the already applied ZOOKEEPER-823 patch. Then I want to test each incremental step at least 5 times to find the step(s) that breaks ZK. This approach should take me another two weeks, I believe, mostly because each Test run takes ~15-25 minutes. Cheers, Thomas Koch, http://www.koch.ro
Re: New strategy for Netty (ZOOKEEPER-823) was: What's the QA strategy of ZooKeeper?
On Sat, Oct 16, 2010 at 1:56 AM, Thomas Koch tho...@koch.ro wrote: Benjamin Reed: actually, the other way of doing the netty patch (since i'm scared of merges) would be to do a refactor cleanup patch with an eye toward netty, and then another patch to actually add netty. [...] Ben you really need to give git a try and stop fearing the branch/merge. ;-) Seriously though, having a branch is not a big deal. In the end you an create one or more patches if you like and apply them, but this is essentially just a merge. My main concern personally is that a branch not go on for too long or get too big, ie incorporate too many changes, not focused. I believe that's not the case here though. Thomas would focus on 1) refactoring the client code to enable netty integration, 2) integrate netty changes. He'd also be adding 3) significant tests (potentially refactoring some code to better allow design for test) to ensure that the code changes (incl refactoring) don't break anything. For the record I'll add that this is pretty much what I did when creating this patch in the first place. Because it was not done on a svn branch, and it's just a big patch ball you can't see that. Also my goals were a bit different from Thomas's (which I'm fine with in principal). I've had exactly the same thought last evening. Instead of trying to find the bug(s) in the current patch, I'd like to start it over again and do small incremental changes from the current trunk towards the current ZOOKEEPER-823 patch. Maybe I could do this in ZOOKEEPER-823 patch, this would mean to revert the already applied ZOOKEEPER-823 patch. Thomas, did you mean to say do this in ZOOKEEPER-823 *branch*? Then I want to test each incremental step at least 5 times to find the step(s) that breaks ZK. This approach should take me another two weeks, I believe, mostly because each Test run takes ~15-25 minutes. This sounds like a reasonable plan to me if you want to try your hand at it. I also appreciate you stepping up on this effort. Unfortunately only committers can commit to apache SVN. Which means that one of us (ben/f/m/h/myself) will have to apply your change to the branch. You'll have to bug one of us when you're ready to apply a new patch to the branch. If you can create a new patch (rather than changing the original) that would be a good idea (easier for us to apply). Shouldn't be much of an issue I assume if you're using git personally. Notice that I've already setup a hudson job that pulls from the branch. https://hudson.apache.org/hudson/view/ZooKeeper/job/ZooKeeper_branch_823/https://hudson.apache.org/hudson/view/ZooKeeper/job/ZooKeeper_branch_823/43/ Regards, Patrick
Re: What's the QA strategy of ZooKeeper?
Hi Vishal, thanks for the list. As you can see when we do find issues we do our best to address them and increase testing in that area. Unfortunately our testing regime, while extensive is not exhaustive. You can see the clover coverage reports here btw: https://hudson.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/clover/ We'd love to see further contributions around testing. Thomas has opened some discussion around code refactoring, and I'm hopeful that will increase the coverage and enable design for test which we lack in some cases. Patrick On Fri, Oct 15, 2010 at 12:24 PM, Vishal K vishalm...@gmail.com wrote: Hi Patrick, On Fri, Oct 15, 2010 at 2:22 PM, Patrick Hunt ph...@apache.org wrote: Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release Vishal, can you be more specific, point out specific JIRAs that you entered would be very valuable. Don't worry about hurting our feelings or anything, without this type of feedback we can't address the specific issues and their underlying problems. Heres a list of few issues: Leader election taking a long time to complete - https://issues.apache.org/jira/browse/ZOOKEEPER-822 Last processed zxid set prematurely while establishing leadership - https://issues.apache.org/jira/browse/ZOOKEEPER-790 FLE implementation should be improved to use non-blocking sockets ZOOKEEPER-900 ZK lets any node to become an observer - https://issues.apache.org/jira/browse/ZOOKEEPER-851 Regards, Patrick On Fri, Oct 15, 2010 at 11:14 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Well said Vishal. I really like the points you put forth!!! Agree on all the points, but again, all the point you mention require commitment from folks like you. Its a pretty hard task to test all the corner cases of ZooKeeper. I'd expect everyone to pitch in for testing a release. We should definitely work towards a plan. You should go ahead and create a jira for the QA plan. We should all pitch in with what all should be tested. Thanks mahadev On 10/15/10 7:32 AM, Vishal K vishalm...@gmail.com wrote: Hi, I would like to add my few cents here. I would suggest to stay away from code cleanup unless it is absolutely necessary. I would also like to extend this discussion to understand the amount of testing/QA to be performed before a release. How do we currently qualify a release? Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release. I will be honest in saying that, unfortunately, these bugs have resulted in questions being raised by several people in our organization about our choice of using ZooKeeper. Nevertheless, our product group really thinks that ZK is a cool technology, but we need to focus on making it robust before adding major new features to it. I would suggest to: 1. Look at current bugs and see why existing test did not uncover these bugs and improve those tests. 2. Look at places that need more tests and broadcast it to the community. Follow-up with test development. 3. Have a crisp release QA strategy for each release. 4. Improve API documentation as well as code documentation so that the API usage is clear and debugging is made easier. Comments? Thanks. -Vishal On Fri, Oct 15, 2010 at 9:44 AM, Thomas Koch tho...@koch.ro wrote: Hi Benjamin, thank you for your response. Please find some comments inline. Benjamin Reed: code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New
New strategy for Netty (ZOOKEEPER-823) was: What's the QA strategy of ZooKeeper?
Benjamin Reed: actually, the other way of doing the netty patch (since i'm scared of merges) would be to do a refactor cleanup patch with an eye toward netty, and then another patch to actually add netty. [...] Hi Benjamin, I've had exactly the same thought last evening. Instead of trying to find the bug(s) in the current patch, I'd like to start it over again and do small incremental changes from the current trunk towards the current ZOOKEEPER-823 patch. Maybe I could do this in ZOOKEEPER-823 patch, this would mean to revert the already applied ZOOKEEPER-823 patch. Then I want to test each incremental step at least 5 times to find the step(s) that breaks ZK. This approach should take me another two weeks, I believe, mostly because each Test run takes ~15-25 minutes. Cheers, Thomas Koch, http://www.koch.ro
Re: What's the QA strategy of ZooKeeper?
Hi Benjamin, thank you for your response. Please find some comments inline. Benjamin Reed: code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. 2) there is always subjectivity: quality code for one code quality zealot is often seen as a bad design by another code quality zealot. unless there is an objective reason to do it, don't. I don't agree. IMHO, the area of subjectivism in code quality is actually very small compared to hard established standards of quality metrics and best practices. I believe that my list given in the first mail of this thread gives examples of rather objective guidelines. 3) you may cleanup the wrong way. you may restructure to make the current code clean and then end up rewriting and refactoring again to change the logic. Yes. Refactoring isn't easy, but necessary. Only over time you better understand your domain and find better structures. Over time you introduce features that let code grow so that it should better be split up in smaller units that the human brain can still handle. i think we can mitigate 1) by only doing it when necessary. as a corollary we can mitigate 2) and 3) by only doing refactoring/cleanups when motivated by some new change: fix a bug, increased performance, new feature, etc. I agree that refactoring should be carefully planned and done in small steps. Therefor I collected each refactoring item for the java client in a small separate bug in https://issues.apache.org/jira/browse/ZOOKEEPER-835 that can individually be discussed, reviewed and tested. Have a nice weekend after Hadoop World! Thomas Koch, http://www.koch.ro
Re: What's the QA strategy of ZooKeeper?
Hi, I would like to add my few cents here. I would suggest to stay away from code cleanup unless it is absolutely necessary. I would also like to extend this discussion to understand the amount of testing/QA to be performed before a release. How do we currently qualify a release? Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release. I will be honest in saying that, unfortunately, these bugs have resulted in questions being raised by several people in our organization about our choice of using ZooKeeper. Nevertheless, our product group really thinks that ZK is a cool technology, but we need to focus on making it robust before adding major new features to it. I would suggest to: 1. Look at current bugs and see why existing test did not uncover these bugs and improve those tests. 2. Look at places that need more tests and broadcast it to the community. Follow-up with test development. 3. Have a crisp release QA strategy for each release. 4. Improve API documentation as well as code documentation so that the API usage is clear and debugging is made easier. Comments? Thanks. -Vishal On Fri, Oct 15, 2010 at 9:44 AM, Thomas Koch tho...@koch.ro wrote: Hi Benjamin, thank you for your response. Please find some comments inline. Benjamin Reed: code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. 2) there is always subjectivity: quality code for one code quality zealot is often seen as a bad design by another code quality zealot. unless there is an objective reason to do it, don't. I don't agree. IMHO, the area of subjectivism in code quality is actually very small compared to hard established standards of quality metrics and best practices. I believe that my list given in the first mail of this thread gives examples of rather objective guidelines. 3) you may cleanup the wrong way. you may restructure to make the current code clean and then end up rewriting and refactoring again to change the logic. Yes. Refactoring isn't easy, but necessary. Only over time you better understand your domain and find better structures. Over time you introduce features that let code grow so that it should better be split up in smaller units that the human brain can still handle. i think we can mitigate 1) by only doing it when necessary. as a corollary we can mitigate 2) and 3) by only doing refactoring/cleanups when motivated by some new change: fix a bug, increased performance, new feature, etc. I agree that refactoring should be carefully planned and done in small steps. Therefor I collected each refactoring item for the java client in a small separate bug in https://issues.apache.org/jira/browse/ZOOKEEPER-835 that can individually be discussed, reviewed and tested. Have a nice weekend after Hadoop World! Thomas Koch, http://www.koch.ro
Re: What's the QA strategy of ZooKeeper?
Well said Vishal. I really like the points you put forth!!! Agree on all the points, but again, all the point you mention require commitment from folks like you. Its a pretty hard task to test all the corner cases of ZooKeeper. I'd expect everyone to pitch in for testing a release. We should definitely work towards a plan. You should go ahead and create a jira for the QA plan. We should all pitch in with what all should be tested. Thanks mahadev On 10/15/10 7:32 AM, Vishal K vishalm...@gmail.com wrote: Hi, I would like to add my few cents here. I would suggest to stay away from code cleanup unless it is absolutely necessary. I would also like to extend this discussion to understand the amount of testing/QA to be performed before a release. How do we currently qualify a release? Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release. I will be honest in saying that, unfortunately, these bugs have resulted in questions being raised by several people in our organization about our choice of using ZooKeeper. Nevertheless, our product group really thinks that ZK is a cool technology, but we need to focus on making it robust before adding major new features to it. I would suggest to: 1. Look at current bugs and see why existing test did not uncover these bugs and improve those tests. 2. Look at places that need more tests and broadcast it to the community. Follow-up with test development. 3. Have a crisp release QA strategy for each release. 4. Improve API documentation as well as code documentation so that the API usage is clear and debugging is made easier. Comments? Thanks. -Vishal On Fri, Oct 15, 2010 at 9:44 AM, Thomas Koch tho...@koch.ro wrote: Hi Benjamin, thank you for your response. Please find some comments inline. Benjamin Reed: code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. 2) there is always subjectivity: quality code for one code quality zealot is often seen as a bad design by another code quality zealot. unless there is an objective reason to do it, don't. I don't agree. IMHO, the area of subjectivism in code quality is actually very small compared to hard established standards of quality metrics and best practices. I believe that my list given in the first mail of this thread gives examples of rather objective guidelines. 3) you may cleanup the wrong way. you may restructure to make the current code clean and then end up rewriting and refactoring again to change the logic. Yes. Refactoring isn't easy, but necessary. Only over time you better understand your domain and find better structures. Over time you introduce features that let code grow so that it should better be split up in smaller units that the human brain can still handle. i think we can mitigate 1) by only doing it when necessary. as a corollary we can mitigate 2) and 3) by only doing refactoring/cleanups when motivated by some new change: fix a bug, increased performance, new feature, etc. I agree that refactoring should be carefully planned and done in small steps. Therefor I collected each refactoring item for the java client in a small separate bug in https://issues.apache.org/jira/browse/ZOOKEEPER-835 that can individually be discussed, reviewed and tested. Have a nice weekend after Hadoop World! Thomas Koch, http://www.koch.ro
Re: What's the QA strategy of ZooKeeper?
Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release Vishal, can you be more specific, point out specific JIRAs that you entered would be very valuable. Don't worry about hurting our feelings or anything, without this type of feedback we can't address the specific issues and their underlying problems. Regards, Patrick On Fri, Oct 15, 2010 at 11:14 AM, Mahadev Konar maha...@yahoo-inc.comwrote: Well said Vishal. I really like the points you put forth!!! Agree on all the points, but again, all the point you mention require commitment from folks like you. Its a pretty hard task to test all the corner cases of ZooKeeper. I'd expect everyone to pitch in for testing a release. We should definitely work towards a plan. You should go ahead and create a jira for the QA plan. We should all pitch in with what all should be tested. Thanks mahadev On 10/15/10 7:32 AM, Vishal K vishalm...@gmail.com wrote: Hi, I would like to add my few cents here. I would suggest to stay away from code cleanup unless it is absolutely necessary. I would also like to extend this discussion to understand the amount of testing/QA to be performed before a release. How do we currently qualify a release? Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release. I will be honest in saying that, unfortunately, these bugs have resulted in questions being raised by several people in our organization about our choice of using ZooKeeper. Nevertheless, our product group really thinks that ZK is a cool technology, but we need to focus on making it robust before adding major new features to it. I would suggest to: 1. Look at current bugs and see why existing test did not uncover these bugs and improve those tests. 2. Look at places that need more tests and broadcast it to the community. Follow-up with test development. 3. Have a crisp release QA strategy for each release. 4. Improve API documentation as well as code documentation so that the API usage is clear and debugging is made easier. Comments? Thanks. -Vishal On Fri, Oct 15, 2010 at 9:44 AM, Thomas Koch tho...@koch.ro wrote: Hi Benjamin, thank you for your response. Please find some comments inline. Benjamin Reed: code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. 2) there is always subjectivity: quality code for one code quality zealot is often seen as a bad design by another code quality zealot. unless there is an objective reason to do it, don't. I don't agree. IMHO, the area of subjectivism in code quality is actually very small compared to hard established standards of quality metrics and best practices. I believe that my list given in the first mail of this thread gives examples of rather objective guidelines. 3) you may cleanup the wrong way. you may restructure to make the current code clean and then end up rewriting and refactoring again to change the logic. Yes. Refactoring isn't easy, but necessary. Only over time you better understand your domain and find better structures. Over time you introduce features that let code grow so that it should better be split up in smaller units that the human brain can still handle. i think we can mitigate 1) by only doing it when necessary. as a corollary we can mitigate 2) and 3) by only doing refactoring/cleanups when motivated by some new change: fix a bug, increased performance, new feature, etc. I agree that refactoring should be carefully planned and done in small steps. Therefor I collected each refactoring
Re: What's the QA strategy of ZooKeeper?
i think we have a very different perspective on the quality issue: I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. great. fixing the code and refactoring before a patch goes in is the perfect time to do it! please give feedback and help make the patch better. there is a reason to exclude checks (which is why there is such excludes), but if we can avoid them we should. before a patch is applied is exactly the time to do cleanup If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. cleaning up code to add a new feature is a great time to clean up the code. Yes. Refactoring isn't easy, but necessary. Only over time you better understand your domain and find better structures. Over time you introduce features that let code grow so that it should better be split up in smaller units that the human brain can still handle. it is the but necessary that i disagree with. there is plenty of code that could be cleaned up and made to look a lot nicer, but we shouldn't touch it, unless we are fixing something else or adding a new feature. it's pretty lame to explain to someone that the bug that was introduced by a code change was motivated by a desire to make the code cleaner. any code change runs the risk of breakage, thus changing code simply for cleanliness is not worth the risk. ben
Re: What's the QA strategy of ZooKeeper?
Hi Mahadev, Yes this will require effort from the community. In a related note how many of the contributors do you think are dedicated (either partially or fully to ZK development/testing)? This will help to in understanding how much effort will be required for tasks and how we can prioritize them based on resources. I will create a JIRA for this with some initial thoughts. However, the list would have to come from folks who are currently familiar with parts of ZK code. Thanks. -Vishal On Fri, Oct 15, 2010 at 2:14 PM, Mahadev Konar maha...@yahoo-inc.comwrote: Well said Vishal. I really like the points you put forth!!! Agree on all the points, but again, all the point you mention require commitment from folks like you. Its a pretty hard task to test all the corner cases of ZooKeeper. I'd expect everyone to pitch in for testing a release. We should definitely work towards a plan. You should go ahead and create a jira for the QA plan. We should all pitch in with what all should be tested. Thanks mahadev On 10/15/10 7:32 AM, Vishal K vishalm...@gmail.com wrote: Hi, I would like to add my few cents here. I would suggest to stay away from code cleanup unless it is absolutely necessary. I would also like to extend this discussion to understand the amount of testing/QA to be performed before a release. How do we currently qualify a release? Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release. I will be honest in saying that, unfortunately, these bugs have resulted in questions being raised by several people in our organization about our choice of using ZooKeeper. Nevertheless, our product group really thinks that ZK is a cool technology, but we need to focus on making it robust before adding major new features to it. I would suggest to: 1. Look at current bugs and see why existing test did not uncover these bugs and improve those tests. 2. Look at places that need more tests and broadcast it to the community. Follow-up with test development. 3. Have a crisp release QA strategy for each release. 4. Improve API documentation as well as code documentation so that the API usage is clear and debugging is made easier. Comments? Thanks. -Vishal On Fri, Oct 15, 2010 at 9:44 AM, Thomas Koch tho...@koch.ro wrote: Hi Benjamin, thank you for your response. Please find some comments inline. Benjamin Reed: code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. 2) there is always subjectivity: quality code for one code quality zealot is often seen as a bad design by another code quality zealot. unless there is an objective reason to do it, don't. I don't agree. IMHO, the area of subjectivism in code quality is actually very small compared to hard established standards of quality metrics and best practices. I believe that my list given in the first mail of this thread gives examples of rather objective guidelines. 3) you may cleanup the wrong way. you may restructure to make the current code clean and then end up rewriting and refactoring again to change the logic. Yes. Refactoring isn't easy, but necessary. Only over time you better understand your domain and find better structures. Over time you introduce features that let code grow so that it should better be split up in smaller units that the human brain can still handle. i think we can mitigate 1) by only doing it when necessary. as a corollary we can mitigate 2) and 3) by only doing refactoring/cleanups when motivated by some new change: fix a bug, increased performance, new feature, etc. I agree that
Re: What's the QA strategy of ZooKeeper?
Hi Patrick, On Fri, Oct 15, 2010 at 2:22 PM, Patrick Hunt ph...@apache.org wrote: Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release Vishal, can you be more specific, point out specific JIRAs that you entered would be very valuable. Don't worry about hurting our feelings or anything, without this type of feedback we can't address the specific issues and their underlying problems. Heres a list of few issues: Leader election taking a long time to complete - https://issues.apache.org/jira/browse/ZOOKEEPER-822 Last processed zxid set prematurely while establishing leadership - https://issues.apache.org/jira/browse/ZOOKEEPER-790 FLE implementation should be improved to use non-blocking sockets ZOOKEEPER-900 ZK lets any node to become an observer - https://issues.apache.org/jira/browse/ZOOKEEPER-851 Regards, Patrick On Fri, Oct 15, 2010 at 11:14 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Well said Vishal. I really like the points you put forth!!! Agree on all the points, but again, all the point you mention require commitment from folks like you. Its a pretty hard task to test all the corner cases of ZooKeeper. I'd expect everyone to pitch in for testing a release. We should definitely work towards a plan. You should go ahead and create a jira for the QA plan. We should all pitch in with what all should be tested. Thanks mahadev On 10/15/10 7:32 AM, Vishal K vishalm...@gmail.com wrote: Hi, I would like to add my few cents here. I would suggest to stay away from code cleanup unless it is absolutely necessary. I would also like to extend this discussion to understand the amount of testing/QA to be performed before a release. How do we currently qualify a release? Recently, we have ran into issues in ZK that I believe should have caught by some basic testing before the release. I will be honest in saying that, unfortunately, these bugs have resulted in questions being raised by several people in our organization about our choice of using ZooKeeper. Nevertheless, our product group really thinks that ZK is a cool technology, but we need to focus on making it robust before adding major new features to it. I would suggest to: 1. Look at current bugs and see why existing test did not uncover these bugs and improve those tests. 2. Look at places that need more tests and broadcast it to the community. Follow-up with test development. 3. Have a crisp release QA strategy for each release. 4. Improve API documentation as well as code documentation so that the API usage is clear and debugging is made easier. Comments? Thanks. -Vishal On Fri, Oct 15, 2010 at 9:44 AM, Thomas Koch tho...@koch.ro wrote: Hi Benjamin, thank you for your response. Please find some comments inline. Benjamin Reed: code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. 2) there is always subjectivity: quality code for one code quality zealot is often seen as a bad design by another code quality zealot. unless there is an objective reason to do it, don't. I don't agree. IMHO, the area of subjectivism in code quality is actually very small compared to hard established standards of quality metrics and best practices. I believe that my list given in the first mail of this thread gives examples of rather objective guidelines. 3) you may cleanup the wrong way. you may restructure to make the current code clean and then end up rewriting and refactoring again to change
Re: What's the QA strategy of ZooKeeper?
I broadly agree with Ben - all meaningful code changes carry a risk of destabilization (otherwise software development would be very easy) so we should guard against improving cleanliness only for its own sake. At the point where bad code gets in the way of fixing bugs or adding features, I think it's very worthwhile to 'lazily' clean code. I did this with the observers patch - reworked some of the class hierarchies to improve encapsulation and make it easier to add new implementations. The netty patch is a good test case for this approach. If we feel that reworking the structure of the existing server cnxn code will make it significantly easier to add a second implementation that adheres to the same interface, then I say that such a refactoring is worthwhile, but even then only if it's straightforward to make the changes while convincing ourselves that the behaviour of the new implementation is consistent with the old. Thomas, do comment on the patch itself! That's the very best way to make sure your concerns get heard and addressed. cheers, Henry On 15 October 2010 11:37, Benjamin Reed br...@yahoo-inc.com wrote: i think we have a very different perspective on the quality issue: I didn't want to say it that clear, but especially the new Netty code, both on client and server side is IMHO an example of new code in very bad shape. The client code patch even changes the FindBugs configuration to exclude the new code from the FindBugs checks. great. fixing the code and refactoring before a patch goes in is the perfect time to do it! please give feedback and help make the patch better. there is a reason to exclude checks (which is why there is such excludes), but if we can avoid them we should. before a patch is applied is exactly the time to do cleanup If your code is already in such a bad shape, that every change includes considerable risk to break something, then you already are in trouble. With every new feature (or bugfix!) you also risk to break something. If you don't have the attitude of permanent refactoring to improve the code quality, you will inevitably lower the maintainability of your code with every new feature. New features will build on the dirty concepts already in the code and therfor make it more expensive to ever clean things up. cleaning up code to add a new feature is a great time to clean up the code. Yes. Refactoring isn't easy, but necessary. Only over time you better understand your domain and find better structures. Over time you introduce features that let code grow so that it should better be split up in smaller units that the human brain can still handle. it is the but necessary that i disagree with. there is plenty of code that could be cleaned up and made to look a lot nicer, but we shouldn't touch it, unless we are fixing something else or adding a new feature. it's pretty lame to explain to someone that the bug that was introduced by a code change was motivated by a desire to make the code cleaner. any code change runs the risk of breakage, thus changing code simply for cleanliness is not worth the risk. ben -- Henry Robinson Software Engineer Cloudera 415-994-6679
Re: What's the QA strategy of ZooKeeper?
On Fri, Oct 15, 2010 at 12:11 PM, Henry Robinson he...@cloudera.com wrote: The netty patch is a good test case for this approach. If we feel that reworking the structure of the existing server cnxn code will make it significantly easier to add a second implementation that adheres to the same interface, then I say that such a refactoring is worthwhile, but even then only if it's straightforward to make the changes while convincing ourselves that the behaviour of the new implementation is consistent with the old. Thomas, do comment on the patch itself! That's the very best way to make sure your concerns get heard and addressed. Well really the _best_ way IMO is to both comment and submit a patch. ;-) And this is just what Thomas is doing, so kudos to him for the effort! Vishal is doing this as well for many of the issues he's found, so thanks to him also. We do appreciate you guys jumping in to help. Lack of contributors is one of the things we've been missing and addressing that opens the door to some of these improvements being suggested. Wrt the netty patch, the approach Ben and I took was to refactor sufficiently to add support for NIO/Netty/... while minimizing breakage. This is already a big patch, esp given that the code is not really as clean to begin with (complex too). Perfect situation, no. But the intent was to further clean things up once the original patch was reviewed/committed. Trying to do a huge refactoring in one shot (one patch) is not a good idea imo. Already these patches are too large. Perhaps lesson learned here is that we should have just created a special branch from the get go, applied a number of smaller patches to that branch, then eventually merged back into the trunk once it was fully baked. Patrick
Re: What's the QA strategy of ZooKeeper?
On Fri, Oct 15, 2010 at 12:11 PM, Henry Robinson he...@cloudera.com wrote: The netty patch is a good test case for this approach. If we feel that reworking the structure of the existing server cnxn code will make it significantly easier to add a second implementation that adheres to the same interface, then I say that such a refactoring is worthwhile, but even then only if it's straightforward to make the changes while convincing ourselves that the behaviour of the new implementation is consistent with the old. Thomas, do comment on the patch itself! That's the very best way to make sure your concerns get heard and addressed. Well really the _best_ way IMO is to both comment and submit a patch. ;-) And this is just what Thomas is doing, so kudos to him for the effort! Vishal is doing this as well for many of the issues he's found, so thanks to him also. We do appreciate you guys jumping in to help. Lack of contributors is one of the things we've been missing and addressing that opens the door to some of these improvements being suggested. Wrt the netty patch, the approach Ben and I took was to refactor sufficiently to add support for NIO/Netty/... while minimizing breakage. This is already a big patch, esp given that the code is not really as clean to begin with (complex too). Perfect situation, no. But the intent was to further clean things up once the original patch was reviewed/committed. Trying to do a huge refactoring in one shot (one patch) is not a good idea imo. Already these patches are too large. Perhaps lesson learned here is that we should have just created a special branch from the get go, applied a number of smaller patches to that branch, then eventually merged back into the trunk once it was fully baked. Patrick
Re: What's the QA strategy of ZooKeeper?
actually, the other way of doing the netty patch (since i'm scared of merges) would be to do a refactor cleanup patch with an eye toward netty, and then another patch to actually add netty. that would have been nice because the first patch would allow us to more easily make sure that NIO wasn't broken. and the second we could focus more on the netty addition. ben On 10/15/2010 03:07 PM, Patrick Hunt wrote: On Fri, Oct 15, 2010 at 12:11 PM, Henry Robinsonhe...@cloudera.com wrote: The netty patch is a good test case for this approach. If we feel that reworking the structure of the existing server cnxn code will make it significantly easier to add a second implementation that adheres to the same interface, then I say that such a refactoring is worthwhile, but even then only if it's straightforward to make the changes while convincing ourselves that the behaviour of the new implementation is consistent with the old. Thomas, do comment on the patch itself! That's the very best way to make sure your concerns get heard and addressed. Well really the _best_ way IMO is to both comment and submit a patch. ;-) And this is just what Thomas is doing, so kudos to him for the effort! Vishal is doing this as well for many of the issues he's found, so thanks to him also. We do appreciate you guys jumping in to help. Lack of contributors is one of the things we've been missing and addressing that opens the door to some of these improvements being suggested. Wrt the netty patch, the approach Ben and I took was to refactor sufficiently to add support for NIO/Netty/... while minimizing breakage. This is already a big patch, esp given that the code is not really as clean to begin with (complex too). Perfect situation, no. But the intent was to further clean things up once the original patch was reviewed/committed. Trying to do a huge refactoring in one shot (one patch) is not a good idea imo. Already these patches are too large. Perhaps lesson learned here is that we should have just created a special branch from the get go, applied a number of smaller patches to that branch, then eventually merged back into the trunk once it was fully baked. Patrick
Re: What's the QA strategy of ZooKeeper?
code quality is important, and there are things we should keep in mind, but in general i really don't like the idea of risking code breakage because of a gratuitous code cleanup. we should be watching out for these things when patches get submitted or when new things go in. i think this is inline with what pat was saying. just to expand a bit. in my opinion clean up refactorings have the following problems: 1) you risk breaking things in production for a potential future maintenance advantage. 2) there is always subjectivity: quality code for one code quality zealot is often seen as a bad design by another code quality zealot. unless there is an objective reason to do it, don't. 3) you may cleanup the wrong way. you may restructure to make the current code clean and then end up rewriting and refactoring again to change the logic. i think we can mitigate 1) by only doing it when necessary. as a corollary we can mitigate 2) and 3) by only doing refactoring/cleanups when motivated by some new change: fix a bug, increased performance, new feature, etc. ben On 10/13/2010 06:18 AM, Thomas Koch wrote: Hi, after filling 13 refactoring issues against the Java Client code[1], I started to dig into the server site code to understand the last issues with the Netty stuff. I feel bad. It's this feeling of I don't wanna hurt you, but ZooKeeper is quite an important piece of the Hadoop ecosystem containing some of the most complicated pieces of code. And it'll only get more complex with more features. I'd propose to have a word about quality assurance. Is there already a strategy to ensure the ongoing maintainability of ZK? Is there a code style guide, a list of Dos-And-Donts (where I'd like to add some points)? Should PMD be added to Hudson? What is the level of FindBugs? Should it be raised? Some of the points I'd like to add to a style guide: - Don't write methods longer then 20-40 lines of code - Are you sure you want to use inner classes? - If there is a new operator in a method? Could the method maybe already receive the object as a parameter? - Are you sure you want to use system properties? They are like global variables and the IDE does not know about them - Are you sure you want to extend a class? Often an aggregation is more elegant. - Don't nest ifs and loops deeper then 2 or 3 levels. If you do so, you should better break your code into more methods. - Use Enums or constants instead of plain status integers - please document your intentions in code comments. You don't need to comment the what? but the why?. Do you agree with me, that there is a need for better code quality in ZooKeeper? If so, it's not really scalable if a manic like me fights like Don Quichotte to clean up the code. All developers would need to establish a sense for clean code and constantly improve the code. [1] https://issues.apache.org/jira/browse/ZOOKEEPER-835 Best regards, Thomas Koch, http://www.koch.ro
What's the QA strategy of ZooKeeper?
Hi, after filling 13 refactoring issues against the Java Client code[1], I started to dig into the server site code to understand the last issues with the Netty stuff. I feel bad. It's this feeling of I don't wanna hurt you, but ZooKeeper is quite an important piece of the Hadoop ecosystem containing some of the most complicated pieces of code. And it'll only get more complex with more features. I'd propose to have a word about quality assurance. Is there already a strategy to ensure the ongoing maintainability of ZK? Is there a code style guide, a list of Dos-And-Donts (where I'd like to add some points)? Should PMD be added to Hudson? What is the level of FindBugs? Should it be raised? Some of the points I'd like to add to a style guide: - Don't write methods longer then 20-40 lines of code - Are you sure you want to use inner classes? - If there is a new operator in a method? Could the method maybe already receive the object as a parameter? - Are you sure you want to use system properties? They are like global variables and the IDE does not know about them - Are you sure you want to extend a class? Often an aggregation is more elegant. - Don't nest ifs and loops deeper then 2 or 3 levels. If you do so, you should better break your code into more methods. - Use Enums or constants instead of plain status integers - please document your intentions in code comments. You don't need to comment the what? but the why?. Do you agree with me, that there is a need for better code quality in ZooKeeper? If so, it's not really scalable if a manic like me fights like Don Quichotte to clean up the code. All developers would need to establish a sense for clean code and constantly improve the code. [1] https://issues.apache.org/jira/browse/ZOOKEEPER-835 Best regards, Thomas Koch, http://www.koch.ro