Hi,

Thank you for clarifying the tradeoffs and choices. I've updated my pull
request for your review; as far as I can tell it meets those choices.
Now there are 3 scenarios:

1) There is no properties file --> everything returns a "default" value.
   These are the defaults I have chosen:

     Version             : <unknown>
     ScalaVersion        : <unknown>
     BuildTime           : 1970-01-01T00:00:00Z
     BuildTimeString     : 1970-01-01T00:00:00+0000
     GitCommitId         : DecafC0ffeeD0d0F00d
     GitCommitIdAbbrev   : DeadD0d0
     GitCommitTime       : 1970-01-01T00:00:00Z
     GitCommitTimeString : 1970-01-01T00:00:00+0000

2) A poorly generated properties file exists --> only the bad values
   return a default. This happens because the maven git plugin is not run
   by the IDE. Example:

     project.version=1.11-SNAPSHOT
     scala.binary.version=2.11
     git.commit.id=${git.commit.id}
     git.commit.id.abbrev=${git.commit.id.abbrev}
     git.commit.time=${git.commit.time}
     git.build.time=${git.build.time}

3) A fully generated properties file exists --> everything returns the
   right value. Example:

     project.version=1.11-SNAPSHOT
     scala.binary.version=2.11
     git.commit.id=8e2642ec0046bb02fe9c1648b589051e66c2f956
     git.commit.id.abbrev=8e2642e
     git.commit.time=2020-04-28T09:22:20+0200
     git.build.time=2020-04-28T10:06:42+0200

Niels

On Tue, Apr 21, 2020 at 2:00 PM Chesnay Schepler <ches...@apache.org> wrote:

> I've had an offline discussion with Till and we came to the following
> conclusion:
>
> All outlined approaches work perfectly when Flink is built in its
> entirety with maven.
> None of the outlined approaches works perfectly when Flink is _not_
> built in its entirety.
>
> The source-file generation approach has similar problems as the
> properties-file one; if I build flink-runtime, switch to another branch,
> and then build flink-dist, the information that will be displayed in the
> UI when I start the cluster may be incorrect.
> Put another way, you can still easily run into the issue of "doesn't
> work on *my* machine", the solution to which is always to rebuild Flink
> completely.
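The handling of the three scenarios above can be sketched roughly as
follows (a minimal illustration by way of example, not the actual Flink
code; the class name, resource path, and fallback values are assumptions):

```java
import java.io.InputStream;
import java.util.Properties;

public class VersionInfo {

    private static final String UNKNOWN = "<unknown>";

    // Returns the property value, or the fallback when the key is missing
    // (scenario 1: no file at all) or still a literal "${...}" placeholder
    // (scenario 2: the IDE copied the file without running the git plugin).
    static String get(Properties props, String key, String fallback) {
        String value = props.getProperty(key);
        if (value == null || value.startsWith("${")) {
            return fallback;
        }
        return value;
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Scenario 1: if the resource is absent, props simply stays empty.
        try (InputStream in =
                VersionInfo.class.getResourceAsStream("/flink-version.properties")) {
            if (in != null) {
                props.load(in);
            }
        }
        System.out.println(get(props, "project.version", UNKNOWN));
        System.out.println(get(props, "git.commit.id", "DecafC0ffeeD0d0F00d"));
    }
}
```

Note how every caller has to go through a defaulting accessor like
`get(...)`; this is the extra "can be null / nonsense value" handling
discussed later in the thread.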
> Frustratingly enough, with both approaches you can even run into the
> issue where the information _seems_ correct when it actually isn't; but
> this isn't something that we can really solve.
>
> As such, this is no longer really a question of correctness, but one of
> convenience and maintenance overhead.
>
> As for maintenance, I think a good argument can be made that the
> generation approach would be easier to maintain, in part because it is
> the more standard approach for this kind of thing.
>
> The properties-file approach however clearly wins in convenience, as it
> neither requires an additional command to be run nor causes compile
> errors. The compile error is quite a problem indeed, since it provides
> no information as to what the actual problem is, or rather how to solve
> it.
>
> Overall we concluded that we would like to stick to the existing
> approach of having a properties file (or JSON or whatever).
>
> I saw that you updated the PR, and will take a look shortly.
>
> On a personal note, I'm sympathetic to your idea, but people are quite
> sensitive to things interrupting their workflow (which I ran into a few
> times already :/ ), so we put a lot of emphasis on that aspect.
>
> On 16/04/2020 16:05, Niels Basjes wrote:
>
> Hi,
>
> Apparently IDEs (like IntelliJ) are imperfect at supporting developers
> in these kinds of use cases.
> The solutions you see are like the "generate-sources" button you have in
> IntelliJ.
> It is a way of working I see in almost all projects I'm involved with:
> almost all either use a parser generator (like javacc or Antlr) or
> generate code from a specification (like Avro or Protobuf).
> In all of these cases there is no "default" value you can use, so this
> discussion never arose in those projects.
>
> All other options (other than 'hard' generating a file/class without
> having defaults) involve having some kind of "default" (regardless of
> whether it is a properties file, a JSON file or a Java interface).
> This will lead to the illusion of "it works on my machine" or "why
> doesn't it work on my machine".
> What I also see is that the options which include "defaults" also carry
> quite a bit of extra code to handle all the situations related to
> loading the settings (either through resource loading or classloading).
>
> The goal I was working on is that the docker image that is loaded is
> tagged with a flink+scala version.
> If the DEV does not generate the right values, or has values from a
> different branch, then a different docker image may be used than what is
> expected, leading to "why doesn't it work on my machine".
>
> So yes, it is an additional learning step for the DEVs, and yes, it
> takes a bit of getting used to.
> Yet I truly believe it is the best direction.
>
> Niels
>
> On Wed, Apr 15, 2020 at 5:26 PM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
> I'm not advocating for a specific approach. The point I wanted to make
> is that there are solutions which allow us to get rid of the problematic
> parsing without disrupting the workflow. If the Jackson JSON file
> approach works for Niels, then I'm fine with that as well.
>
> Cheers,
> Till
>
> On Wed, Apr 15, 2020 at 5:21 PM Chesnay Schepler <ches...@apache.org>
> wrote:
>
> It doesn't have to be a properties file, nor do we necessarily have to
> do any manual parsing.
> It could just be a JSON file that we point Jackson at. Naturally we
> could also generate it with Jackson.
> You'd have a POJO for all the fields with sane defaults (an analogue to
> the proposed generated POJO) and 5-6 lines at most for loading the file
> and handling errors.
>
> I don't think this would be such a problem.
>
> If I understand Till correctly he's proposing a service-loader approach,
> which seems overkill to me. It also introduces some inconsistency in
> what one needs to do before working on stuff, which I'm apprehensive
> about.
> On 15/04/2020 16:31, Till Rohrmann wrote:
>
> Hi everyone,
>
> thanks for starting this discussion Niels.
>
> I like the idea of getting rid of parsing a Properties instance. On the
> other hand, I also understand that people are concerned about disrupting
> the workflows of our devs.
>
> Maybe we can find a compromise between both approaches. For example, one
> could define an EnvironmentInformationProvider interface which is loaded
> at runtime. The implementation of this interface could be generated by
> `mvn generate-sources`. Moreover, if we cannot find it, then we fail
> with a descriptive exception saying to run `mvn generate-sources`. That
> way only devs who rely on the EnvironmentInformation have to run `mvn
> generate-sources` when testing/developing, and we could still get rid of
> all the parsing and logic of a properties file. What do you think?
>
> Cheers,
> Till
>
> On Mon, Apr 13, 2020 at 10:53 AM Niels Basjes <ni...@basjes.nl> wrote:
>
> Hi,
>
> On Mon, Apr 13, 2020 at 8:30 AM Yang Wang <danrtsey...@gmail.com> wrote:
>
> Although the latter option is more stable, I think it is not acceptable
> for all the developers to have to execute `mvn generate-sources` first.
> Otherwise, the Flink project is just broken and cannot run tests or
> Flink jobs in the IDE.
>
> It is important to realize that this discussion is essentially about the
> fact that systems like IntelliJ (which I use) execute the various ways
> of generating code defined in a maven pom.xml in different ways. Simply
> put: they don't automatically execute all instructions that have been
> defined in the pom.xml when needed.
>
> - With the properties file variant (which uses resource filtering),
>   IntelliJ does the resource filtering automatically, yet no maven
>   plugins are run that affect the properties that are replaced.
>   So the git variables are NOT populated when running in IntelliJ, and
>   I have seen that several values in the properties file are nonsense.
>   As a consequence, the Java code reading those needs to catch things
>   like "missing properties file" and various kinds of "nonsense values"
>   and replace them with a workable default.
>   So when running this code you are actually running something different
>   from what will be run when doing a `mvn clean install`, yet the
>   developer is under the illusion it all works because of the code that
>   catches all of those problems.
>   Depending on the way these variables are used, this may lead to "it
>   fails in production but it works fine in IntelliJ" situations.
>   In my mind the code that catches all of those exceptional situations
>   in poorly generated properties only exists so that developers can run
>   the (otherwise perfectly fine) software in a "broken"
>   build/development environment.
>
> - With the way I propose to generate the code, the effect is indeed
>   slightly harder: if you do not run the pom.xml based code generation,
>   it will not work.
>   I understand that this requires the developers to think a bit more
>   about the code they are working on.
>   The upside is that the code either works perfectly or does not
>   compile. There is no "it compiles but is really nonsense".
>   I prefer this.
>   Also, at this point in time this is already true for Flink: there are
>   quite a few modules where the developer MUST run `mvn generate-sources`
>   for all tests to run successfully.
>   Like I indicated, there is a javacc SQL parser, and there are a lot of
>   tests that rely on generating Avro code before they are able to run.
>   I have several projects where systems like Avro and Antlr force me in
>   this direction; there is simply no other way to do this.
>
> So I think the version properties file is enough for now.
> +1 for the first option.
>
> Like I said, I'm fine with either choice by the committers.
> I would choose the `mvn generate-sources` code variant as it is much
> simpler and much more reliable.
>
> Niels
>
> Niels Basjes <ni...@basjes.nl> wrote on Thu, Apr 9, 2020 at 4:47 PM:
>
> Hi,
>
> I'm working on https://issues.apache.org/jira/browse/FLINK-16871 to make
> more build time variables (like the scala version) available in the code
> at runtime.
>
> During the review process there was discussion around a basic question:
> *Is generating java code during the build ok?*
> See
>
> - https://github.com/apache/flink/pull/11245#discussion_r400035133
> - https://github.com/apache/flink/pull/11592
> - https://github.com/apache/flink/pull/11592#issuecomment-610282450
>
> As suggested by Chesnay Schepler I'm putting this question to the
> mailing list.
> https://github.com/apache/flink/pull/11592#issuecomment-610963947
>
> The main discussion was around the ease of use when running in an IDE
> like IntelliJ.
>
> So essentially we have two solution directions available:
>
> 1. *Generate a properties file and then use the classloader to load this
>    file as a resource and then parse it as a property file.*
>    This is the currently used solution direction for this part of the
>    code.
>    A rough version of this (to be improved):
>    https://github.com/apache/flink/commit/47099f663b7644056e9d87b262cd4dba034f513e
>    This method has several effects:
>    1. The developer can run the project immediately from within the IDE
>       as fallback values are provided if the 'actual' values are
>       missing.
>    2. This property file (with stuff that should never be overwritten)
>       can be modified by placing a different one in the classpath.
>       In fact it IS modified in flink-dist, as it generates a new file
>       with the same name into the binary distribution (I consider this
>       to be bad).
>    3. Loading resources means loading, parsing and a lot of error
>       handling. Lots of things "can be null" or be a default value, so
>       the values are unreliable and lots of code needs to handle this.
>       In fact, when running from IntelliJ the properties file is
>       generated poorly most of the time; only during a normal maven
>       build will it work correctly.
>
> 2. *Generate a Java source file and then simply compile this and make it
>    part of the project.*
>    Essentially the same model as you would have when using Apache Avro,
>    Protobuf, Antlr 4 and javacc (several of those are used in Flink!).
>    A rough version of this (to be improved):
>    https://github.com/apache/flink/commit/d215e4df60dc9d647dcee1aa9a2114cbf49d0566
>    This method has several effects:
>    1. The developer MUST run 'mvn generate-sources' before the project
>       can be run from within the IDE; no fallback values are provided if
>       the 'actual' values are missing.
>    2. The code/test will not run until this step is done.
>    3. Because the file is generated by a plugin it is always correct.
>       As a consequence all variables are always available and the
>       downstream users no longer have to handle the "can be null" or
>       "default value" situations.
>
> So: is generating code similar to what I created a desired change?
> My opinion is that it is the better solution; the data available is more
> reliable and as a consequence the rest of the code is simpler.
> It would probably mean that during development of Flink you should be
> aware of this and do an 'extra step' to get it running.
>
> What do you guys think?
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes


--
Best regards / Met vriendelijke groeten,

Niels Basjes
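For reference, the service-loader compromise Till describes in the quoted
thread (an interface whose implementation is produced by `mvn
generate-sources`, failing with a descriptive exception when it is
missing) could be sketched roughly like this. Only the interface name
EnvironmentInformationProvider comes from the thread; the methods,
class name, and error message are assumptions, not actual Flink code:

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Implemented by a class generated during `mvn generate-sources` and
// registered via META-INF/services (names here are illustrative).
interface EnvironmentInformationProvider {
    String getVersion();
    String getScalaVersion();
}

public class EnvironmentInformation {

    static EnvironmentInformationProvider load() {
        Iterator<EnvironmentInformationProvider> it =
                ServiceLoader.load(EnvironmentInformationProvider.class).iterator();
        if (!it.hasNext()) {
            // Fail with a descriptive message telling the developer what
            // to do, rather than an opaque compile or runtime error.
            throw new IllegalStateException(
                    "No EnvironmentInformationProvider found on the classpath. "
                            + "Run 'mvn generate-sources' to generate the implementation.");
        }
        return it.next();
    }

    public static void main(String[] args) {
        try {
            System.out.println(load().getVersion());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, only developers whose code actually calls `load()` need
to run `mvn generate-sources`, which is the workflow tradeoff debated
above.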