Re: [DISCUSS] Issues with heterogeneity of the code
Massive +1 for switching to space indention. Makes the code render consistently across various viewers (e.g. Github UI, Apache infrastructure, IDEs). 2015-03-18 1:29 GMT+01:00 Fabian Hueske fhue...@gmail.com: Touching every file of the code would also be a good opportunity to switch from tab to space indention. So if we enforce a strict style, we could also address this issue which causes discussions every now and then. 2015-03-16 21:53 GMT+01:00 Aljoscha Krettek aljos...@apache.org: No, but I don't know whether that's possible. The style guide prescribes, for example, this: def foo( a: Int, b: String, c: String) for methods with long parameter lists while a lot of people do this: def foo(a: Int, b: String, c: String) (IntelliJ also does this). The scalastyle rules I added supposedly check for the official scala guide style but they allow both styles of methods. On Mon, Mar 16, 2015 at 5:37 PM, Till Rohrmann trohrm...@apache.org wrote: Do we already enforce the official Scala style guide strictly? On Mon, Mar 16, 2015 at 4:57 PM, Aljoscha Krettek aljos...@apache.org wrote: I'm already always sticking to the official Scala style guide, with the exception of 100 line length. On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for stricter Java code styles. I haven't looked into the Google Code Style but maybe we make it easier for new contributors if we apply a coding style which is somehow known. +1 for line length of 100 for Scala code. I think it makes code review on GitHub easier. For the Scala style, we could stick to official style guidelines [1]. [1] http://docs.scala-lang.org/style/ On Mon, Mar 16, 2015 at 3:06 PM, Hermann Gábor reckone...@gmail.com wrote: +1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
I'm against changing the indentation, for the same reasons as Stephan listed. In my opinion, the codebase has grown too large to just switch the indentation or the entire code style (to the google style or whatever). We have 235870 LOC of Java and 24173 LOC of Scala. Therefore, I'm proposing to: a) Decide on one validation framework (Guava seems to be our favorite here), change document it. b) Decide on one parsing framework, change document it. c) Decide on a list of checkstyle rules. d) Add pull requests for each rule, adding it to the checkstyle configuration and making the code compliant. Regarding the rules themselves: I have the impression that there is an unwritten agreement on the code style by the core committers of Flink. There might be some minor differences nobody cares about .. but all in all the majority of the code is pretty uniform (at least how I perceive it) The list of checkstyle rules should basically enforce this core committers code style because it would minimize the number of required changes. Also, the style has evolved over many years and seems to work well for our needs. On Wed, Mar 18, 2015 at 4:25 PM, Stephan Ewen se...@apache.org wrote: I agree, if we set p a new project, we should use space indentation. Should we really refactor 300k lines of code? Would be massive. Also: The history would basically show a single committer for all code. Git blame (for error tracing) would become useless. On Wed, Mar 18, 2015 at 3:49 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: Massive +1 for switching to space indention. Makes the code render consistently across various viewers (e.g. Github UI, Apache infrastructure, IDEs). 2015-03-18 1:29 GMT+01:00 Fabian Hueske fhue...@gmail.com: Touching every file of the code would also be a good opportunity to switch from tab to space indention. So if we enforce a strict style, we could also address this issue which causes discussions every now and then. 2015-03-16 21:53 GMT+01:00 Aljoscha Krettek aljos...@apache.org: No, but I don't know whether that's possible. The style guide prescribes, for example, this: def foo( a: Int, b: String, c: String) for methods with long parameter lists while a lot of people do this: def foo(a: Int, b: String, c: String) (IntelliJ also does this). The scalastyle rules I added supposedly check for the official scala guide style but they allow both styles of methods. On Mon, Mar 16, 2015 at 5:37 PM, Till Rohrmann trohrm...@apache.org wrote: Do we already enforce the official Scala style guide strictly? On Mon, Mar 16, 2015 at 4:57 PM, Aljoscha Krettek aljos...@apache.org wrote: I'm already always sticking to the official Scala style guide, with the exception of 100 line length. On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for stricter Java code styles. I haven't looked into the Google Code Style but maybe we make it easier for new contributors if we apply a coding style which is somehow known. +1 for line length of 100 for Scala code. I think it makes code review on GitHub easier. For the Scala style, we could stick to official style guidelines [1]. [1] http://docs.scala-lang.org/style/ On Mon, Mar 16, 2015 at 3:06 PM, Hermann Gábor reckone...@gmail.com wrote: +1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015,
Re: [DISCUSS] Issues with heterogeneity of the code
I agree, if we set p a new project, we should use space indentation. Should we really refactor 300k lines of code? Would be massive. Also: The history would basically show a single committer for all code. Git blame (for error tracing) would become useless. On Wed, Mar 18, 2015 at 3:49 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: Massive +1 for switching to space indention. Makes the code render consistently across various viewers (e.g. Github UI, Apache infrastructure, IDEs). 2015-03-18 1:29 GMT+01:00 Fabian Hueske fhue...@gmail.com: Touching every file of the code would also be a good opportunity to switch from tab to space indention. So if we enforce a strict style, we could also address this issue which causes discussions every now and then. 2015-03-16 21:53 GMT+01:00 Aljoscha Krettek aljos...@apache.org: No, but I don't know whether that's possible. The style guide prescribes, for example, this: def foo( a: Int, b: String, c: String) for methods with long parameter lists while a lot of people do this: def foo(a: Int, b: String, c: String) (IntelliJ also does this). The scalastyle rules I added supposedly check for the official scala guide style but they allow both styles of methods. On Mon, Mar 16, 2015 at 5:37 PM, Till Rohrmann trohrm...@apache.org wrote: Do we already enforce the official Scala style guide strictly? On Mon, Mar 16, 2015 at 4:57 PM, Aljoscha Krettek aljos...@apache.org wrote: I'm already always sticking to the official Scala style guide, with the exception of 100 line length. On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for stricter Java code styles. I haven't looked into the Google Code Style but maybe we make it easier for new contributors if we apply a coding style which is somehow known. +1 for line length of 100 for Scala code. I think it makes code review on GitHub easier. For the Scala style, we could stick to official style guidelines [1]. [1] http://docs.scala-lang.org/style/ On Mon, Mar 16, 2015 at 3:06 PM, Hermann Gábor reckone...@gmail.com wrote: +1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
+1 for keeping the tabs vs space for now. When I start working on Flink I was itching to move to spaces indentation since that makes reader consistent. But that was the choice by preference made early on and I think we could keep it and add as exception in Java code style. Scala, on the other hand is different. We should stick with Scala standard code style and add exceptions as needed. For those don't pay attention, there has been small backlash when Databricks announce it's Scala coding guide which not promoting the recommended way of FP use for Scala. I would like to have written agreement on style, meaning stricter style check for Java and Scala. The idea is to reduce heterogeneity of the code for easy to follow and understand. Right now the existing committers have good agreement of code style but like I have said before, the bigger the community the harder to enforce this style (e.g: Hadoop) - Henry On Wed, Mar 18, 2015 at 8:56 AM, Robert Metzger rmetz...@apache.org wrote: I'm against changing the indentation, for the same reasons as Stephan listed. In my opinion, the codebase has grown too large to just switch the indentation or the entire code style (to the google style or whatever). We have 235870 LOC of Java and 24173 LOC of Scala. Therefore, I'm proposing to: a) Decide on one validation framework (Guava seems to be our favorite here), change document it. b) Decide on one parsing framework, change document it. c) Decide on a list of checkstyle rules. d) Add pull requests for each rule, adding it to the checkstyle configuration and making the code compliant. Regarding the rules themselves: I have the impression that there is an unwritten agreement on the code style by the core committers of Flink. There might be some minor differences nobody cares about .. but all in all the majority of the code is pretty uniform (at least how I perceive it) The list of checkstyle rules should basically enforce this core committers code style because it would minimize the number of required changes. Also, the style has evolved over many years and seems to work well for our needs. On Wed, Mar 18, 2015 at 4:25 PM, Stephan Ewen se...@apache.org wrote: I agree, if we set p a new project, we should use space indentation. Should we really refactor 300k lines of code? Would be massive. Also: The history would basically show a single committer for all code. Git blame (for error tracing) would become useless. On Wed, Mar 18, 2015 at 3:49 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: Massive +1 for switching to space indention. Makes the code render consistently across various viewers (e.g. Github UI, Apache infrastructure, IDEs). 2015-03-18 1:29 GMT+01:00 Fabian Hueske fhue...@gmail.com: Touching every file of the code would also be a good opportunity to switch from tab to space indention. So if we enforce a strict style, we could also address this issue which causes discussions every now and then. 2015-03-16 21:53 GMT+01:00 Aljoscha Krettek aljos...@apache.org: No, but I don't know whether that's possible. The style guide prescribes, for example, this: def foo( a: Int, b: String, c: String) for methods with long parameter lists while a lot of people do this: def foo(a: Int, b: String, c: String) (IntelliJ also does this). The scalastyle rules I added supposedly check for the official scala guide style but they allow both styles of methods. On Mon, Mar 16, 2015 at 5:37 PM, Till Rohrmann trohrm...@apache.org wrote: Do we already enforce the official Scala style guide strictly? On Mon, Mar 16, 2015 at 4:57 PM, Aljoscha Krettek aljos...@apache.org wrote: I'm already always sticking to the official Scala style guide, with the exception of 100 line length. On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for stricter Java code styles. I haven't looked into the Google Code Style but maybe we make it easier for new contributors if we apply a coding style which is somehow known. +1 for line length of 100 for Scala code. I think it makes code review on GitHub easier. For the Scala style, we could stick to official style guidelines [1]. [1] http://docs.scala-lang.org/style/ On Mon, Mar 16, 2015 at 3:06 PM, Hermann Gábor reckone...@gmail.com wrote: +1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files
Re: [DISCUSS] Issues with heterogeneity of the code
Touching every file of the code would also be a good opportunity to switch from tab to space indention. So if we enforce a strict style, we could also address this issue which causes discussions every now and then. 2015-03-16 21:53 GMT+01:00 Aljoscha Krettek aljos...@apache.org: No, but I don't know whether that's possible. The style guide prescribes, for example, this: def foo( a: Int, b: String, c: String) for methods with long parameter lists while a lot of people do this: def foo(a: Int, b: String, c: String) (IntelliJ also does this). The scalastyle rules I added supposedly check for the official scala guide style but they allow both styles of methods. On Mon, Mar 16, 2015 at 5:37 PM, Till Rohrmann trohrm...@apache.org wrote: Do we already enforce the official Scala style guide strictly? On Mon, Mar 16, 2015 at 4:57 PM, Aljoscha Krettek aljos...@apache.org wrote: I'm already always sticking to the official Scala style guide, with the exception of 100 line length. On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for stricter Java code styles. I haven't looked into the Google Code Style but maybe we make it easier for new contributors if we apply a coding style which is somehow known. +1 for line length of 100 for Scala code. I think it makes code review on GitHub easier. For the Scala style, we could stick to official style guidelines [1]. [1] http://docs.scala-lang.org/style/ On Mon, Mar 16, 2015 at 3:06 PM, Hermann Gábor reckone...@gmail.com wrote: +1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
+1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
+1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
+1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
Do we already enforce the official Scala style guide strictly? On Mon, Mar 16, 2015 at 4:57 PM, Aljoscha Krettek aljos...@apache.org wrote: I'm already always sticking to the official Scala style guide, with the exception of 100 line length. On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for stricter Java code styles. I haven't looked into the Google Code Style but maybe we make it easier for new contributors if we apply a coding style which is somehow known. +1 for line length of 100 for Scala code. I think it makes code review on GitHub easier. For the Scala style, we could stick to official style guidelines [1]. [1] http://docs.scala-lang.org/style/ On Mon, Mar 16, 2015 at 3:06 PM, Hermann Gábor reckone...@gmail.com wrote: +1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
I'm already always sticking to the official Scala style guide, with the exception of 100 line length. On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for stricter Java code styles. I haven't looked into the Google Code Style but maybe we make it easier for new contributors if we apply a coding style which is somehow known. +1 for line length of 100 for Scala code. I think it makes code review on GitHub easier. For the Scala style, we could stick to official style guidelines [1]. [1] http://docs.scala-lang.org/style/ On Mon, Mar 16, 2015 at 3:06 PM, Hermann Gábor reckone...@gmail.com wrote: +1 for the stricter Java code styles. We should not forget about providing code formatter settings for Eclipse and Intellij IDEA (as mentioned above). That would help a lot. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style. However, let's not introduce a line legth of 100 like in Scala. I think that's hurting readability of the code. On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote: On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote: I'm in favor of strict coding styles. And I like the google style. +1 I would like that. We essentially all agree that we want more homogeneity and I think strict rules are the only way to go. Since this is a very subjective matter it makes sense to go with something (somewhat) well established like the Google code style.
Re: [DISCUSS] Issues with heterogeneity of the code
Regarding style, yes, we already have them in place but they are very loose, especially in Java. I guess it is a no good deed goes unpunished scenario. To tighten up the style rules, for example following Google Java style with some documented exceptions, will require massive code changes. But we already do large code changes or refactoring in some parts of the code tree. Some current PRs are already huge in size. - Henry On Fri, Mar 13, 2015 at 11:51 AM, Stephan Ewen se...@apache.org wrote: We already have checkstyle for Java and Scala in place (with marking violations a breaking the build). The rules in Java are very loose, though. We may make them stricter. Would require extensive passes over a lot of code, though, to fix this. The other things (choice of library) seem to be well addressable and we can document and enforce them immediately, if we want. Stephan On Mon, Mar 9, 2015 at 6:26 PM, Henry Saputra henry.sapu...@gmail.com wrote: Thanks for bringing up the discussions, Stephan. Ufuk and Till had brought up some ideas to solve the example of issues you mentioned in the original thread. So In the nutshell, we need to have more strick style and rules checking for the code to help contributors to submit code change and maintain bit more homogenous in style and code flow. Some ideas of concrete steps I could think of and have done it in other projects: 1. Add Google Java code style [1] as the checkstyle rule and document the differences from the rule (For example, tabs instead of spaces). Set it as break build if style is violated. We did similar thing in Tachyon [2] and seemed to help. 2. Declare Scala code style [3] as the main code style for Scala portion of the code and enforce it. Similar to Java add documentation for differences or details that is not covered by the rules. 3. Add code guide for stuff which are not covered by the rules, such as parameter checking to use Guava or common-lang3 and enforce it by code review. With this do global code change to reflect this. Just my 2-cents, and as the others had mentioned, we need to make this as explicit - Henry [1] https://google-styleguide.googlecode.com/svn/trunk/javaguide.html [2] http://tachyon-project.org/Startup-Tasks-for-New-Contributors.html [3] http://docs.scala-lang.org/style/ On Mon, Mar 9, 2015 at 2:35 AM, Till Rohrmann till.rohrm...@gmail.com wrote: I also agree that we have too many different ways of doing things. A set of common rules/ways would definitely be beneficial for the project. Concerning the command line parsing: I thought that Alexander Alexandrov wanted to unify the command line parsing by replacing both tools with a better one. So this should be fixed sooner or later. The different code styles are just natural if there is no common set of established rules, which are also enforced. I see two solutions: Either enforcing common coding rules or refraining from reformatting the code from other contributors. I don't know whether we can find a common denominator with which everyone can live and which is yet specific enough to make the code base more homogenous. I also agree that the mixed Java/Scala projects make it harder to get started. I've often seen that people confuse the basic types (Scala tuples vs. our own tuples, Java list vs. Scala list, etc.). This is probably something we cannot fix without rewriting parts which are implemented in the other language, though. Personally I don't see the different testing styles as critical. Whether one is using JUnit tests, WordSpecLike tests or FlatSpec tests, it should be pretty obvious for everyone where the testing code is written in case that they want to change something. Moreover, most of the time you would not want to change proper defined test cases. In fact, I think that WordSpecLike and FlatSpec let you write better tests, because it encourages people to clearly and more naturally write what they want to test. Compared to some shorter and often not so meaningful junit method names, this is for me a clear advantage. If a test case fails, I want to directly know without having to go through the testing code, what was tested and what went probably wrong. However, this holds only true if we don't use the JUnitRunner to run these tests. Unfortunately, this is currently the case. With JUnitRunner, the actual result output of the tests is often hard to decode. Therefore, we should probably stick to writing testing methods if we use JUnit to run our Scala tests but use the more expressive FlatSpec for pure Scala modules. That said, I would be in favour of some explicitly stated guidelines, because Lets' keep it in mind will be forgotten soon. On Mon, Mar 9, 2015 at 8:46 AM, Ufuk Celebi u...@apache.org wrote: Hey Stephan, On 08 Mar 2015, at 23:17, Stephan Ewen se...@apache.org wrote: Hi everyone! I would like to
Re: [DISCUSS] Issues with heterogeneity of the code
Agree. We have make decision either to play tight or loose on the code style and guide. Once the codebase is getting too large and more committers coming in then it would be too late. We can not have our cake and eat it too. Looking forward to what others think since I already have my 2-cents out =) - Henry On Fri, Mar 13, 2015 at 2:31 PM, Fabian Hueske fhue...@gmail.com wrote: I personally find the Google code style to be too strict/detailed. Loosening it by dropping certain rules makes only sense if the deviation does not become to large. My major concern with adding a such strict code style is that all open PRs would become invalid. We could try to reduce that effect by adding the code styling module by module and primarily resolving PRs that address the next module. However, this would still be a huge effort and I am not sure if it would pay back. In any case, it is good to have this discussion now. Postponing the decision will make it only more costly if we agree on a more rigid code style. 2015-03-13 20:18 GMT+01:00 Henry Saputra henry.sapu...@gmail.com: Regarding style, yes, we already have them in place but they are very loose, especially in Java. I guess it is a no good deed goes unpunished scenario. To tighten up the style rules, for example following Google Java style with some documented exceptions, will require massive code changes. But we already do large code changes or refactoring in some parts of the code tree. Some current PRs are already huge in size. - Henry On Fri, Mar 13, 2015 at 11:51 AM, Stephan Ewen se...@apache.org wrote: We already have checkstyle for Java and Scala in place (with marking violations a breaking the build). The rules in Java are very loose, though. We may make them stricter. Would require extensive passes over a lot of code, though, to fix this. The other things (choice of library) seem to be well addressable and we can document and enforce them immediately, if we want. Stephan On Mon, Mar 9, 2015 at 6:26 PM, Henry Saputra henry.sapu...@gmail.com wrote: Thanks for bringing up the discussions, Stephan. Ufuk and Till had brought up some ideas to solve the example of issues you mentioned in the original thread. So In the nutshell, we need to have more strick style and rules checking for the code to help contributors to submit code change and maintain bit more homogenous in style and code flow. Some ideas of concrete steps I could think of and have done it in other projects: 1. Add Google Java code style [1] as the checkstyle rule and document the differences from the rule (For example, tabs instead of spaces). Set it as break build if style is violated. We did similar thing in Tachyon [2] and seemed to help. 2. Declare Scala code style [3] as the main code style for Scala portion of the code and enforce it. Similar to Java add documentation for differences or details that is not covered by the rules. 3. Add code guide for stuff which are not covered by the rules, such as parameter checking to use Guava or common-lang3 and enforce it by code review. With this do global code change to reflect this. Just my 2-cents, and as the others had mentioned, we need to make this as explicit - Henry [1] https://google-styleguide.googlecode.com/svn/trunk/javaguide.html [2] http://tachyon-project.org/Startup-Tasks-for-New-Contributors.html [3] http://docs.scala-lang.org/style/ On Mon, Mar 9, 2015 at 2:35 AM, Till Rohrmann till.rohrm...@gmail.com wrote: I also agree that we have too many different ways of doing things. A set of common rules/ways would definitely be beneficial for the project. Concerning the command line parsing: I thought that Alexander Alexandrov wanted to unify the command line parsing by replacing both tools with a better one. So this should be fixed sooner or later. The different code styles are just natural if there is no common set of established rules, which are also enforced. I see two solutions: Either enforcing common coding rules or refraining from reformatting the code from other contributors. I don't know whether we can find a common denominator with which everyone can live and which is yet specific enough to make the code base more homogenous. I also agree that the mixed Java/Scala projects make it harder to get started. I've often seen that people confuse the basic types (Scala tuples vs. our own tuples, Java list vs. Scala list, etc.). This is probably something we cannot fix without rewriting parts which are implemented in the other language, though. Personally I don't see the different testing styles as critical. Whether one is using JUnit tests, WordSpecLike tests or FlatSpec tests, it should be pretty obvious for everyone where the testing code is written in case that they want to change something. Moreover, most of the time you
Re: [DISCUSS] Issues with heterogeneity of the code
Hey Stephan, On 08 Mar 2015, at 23:17, Stephan Ewen se...@apache.org wrote: Hi everyone! I would like to start an open discussion about some issue with the heterogeneity of the Flink code base. Thanks for bringing this up. I agree with your position. The related discussion about using Guava vs. Validate is a good step into the right direction. In general, I think it's super hard to get more homogeneity without enforcing rules (like in the Guava/Validate discussion). I would be OK with trying to settle on rules and then enforcing them. But I'm not sure whether that is what you are asking for? Are you more aiming at a Let's keep it in mind kind of thingy?! Here are a few examples: - Parameter checking is sometimes done with commons-lang3, commons-lang, or guava - Command line parsing is sometimes done with commons-cli, sometimes with scopt. I think these are easily enforceable and could also be changed manually w/o too much hassle. - Code styles are quite different from commit to commit. Spaces, indentations, braces. Not a critical thing, but seems to encourage people to reformat other people's code, whenever the pass over it, which should be avoided (cluttered diffs, may introduce new bugs actually) This is something we could more strictly enforce in pull requests and generally ask people to refrain from. - Some projects are mixed Java/Scala, which is not perfectly supported by the tools so far. It also needs many fromJava / toJava conversions and makes the entry hurdle into the project higher. - Tests are sometimes written as Java Unit tests, sometimes as Scala Unit tests (method style), sometimes as Scala Unit Tests (grammar style). This is an artifact of the mixed Scala/Java discussion. I agree that this can be problematic, but I'm not sure how to solve this as long as we mix Java/Scala in the same modules?! For new code in the runtime, we could stick to one language. What do you propose here as a solution? I am eager to hear opinions! As I've said, I agree with your points, but I think a big issue for new comers and committers alike is missing documentation in the code. We should try to keep the discussion we had in that regard in mind as well. – Ufuk
[DISCUSS] Issues with heterogeneity of the code
Hi everyone! I would like to start an open discussion about some issue with the heterogeneity of the Flink code base. We have, since the beginning in Apache (and even since we started the predecessor project, Stratosphere) refrained from strictly enforcing conventions like formatting, style, or libraries. I like the idea behind it, that committers and contributors are not forced into a corset of hundreds of rules before they can contribute something. As the project is growing, more and more people with different backgrounds have joined, and the project has grown a bit heterogeneous in several parts. In many cases, not necessarily due to need for different functionality, but simply due to roll your own style. I think this is starting to become a bit of an issue. Here are a few examples: - Parameter checking is sometimes done with commons-lang3, commons-lang, or guava - Command line parsing is sometimes done with commons-cli, sometimes with scopt. - Code styles are quite different from commit to commit. Spaces, indentations, braces. Not a critical thing, but seems to encourage people to reformat other people's code, whenever the pass over it, which should be avoided (cluttered diffs, may introduce new bugs actually) - Some projects are mixed Java/Scala, which is not perfectly supported by the tools so far. It also needs many fromJava / toJava conversions and makes the entry hurdle into the project higher. - Tests are sometimes written as Java Unit tests, sometimes as Scala Unit tests (method style), sometimes as Scala Unit Tests (grammar style). Not all things need to be unified across the entire Flink code base. But it becomes harder to switch between projects, even for seasoned Flinksters. And it becomes a hurdle for new contributors, which is very critical. I, personally, would like to encourage people to keep this in mind. Easier understanding of the code and easier entry for newcomers (for which a certain homogeneity helps quite a bit) should have a higher priority than the desire to stick to the personal favorite code style or library. This is a big community effort, after all. That said, we should not, of course, block of the use of new libraries/languages/features when they have significant benefit over the existing state. I am eager to hear opinions! Stephan