Hi,

I'm working on https://issues.apache.org/jira/browse/FLINK-16871 to make
more build-time variables (like the Scala version) available to the code
at runtime.

During the review process there was discussion around a basic question: *Is
generating Java code during the build ok?*
See

   - https://github.com/apache/flink/pull/11245#discussion_r400035133
   - https://github.com/apache/flink/pull/11592
   - https://github.com/apache/flink/pull/11592#issuecomment-610282450

As suggested by Chesnay Schepler I'm putting this question to the mailing
list.
  https://github.com/apache/flink/pull/11592#issuecomment-610963947

The main discussion was around the ease of use when running in an IDE like
IntelliJ.

So essentially we have two solution directions available:

   1. *Generate a properties file and then use the classloader to load this
   file as a resource and then parse it as a property file.*
   This is the currently used solution direction for this part of the code.
   A rough version of this (to be improved):
   https://github.com/apache/flink/commit/47099f663b7644056e9d87b262cd4dba034f513e
   This method has several effects:
      1. The developer can run the project immediately from within the IDE
      as fallback values are provided if the 'actual' values are missing.
      2. This property file (with stuff that should never be overwritten)
      can be modified by placing a different one in the classpath. In fact
      it IS modified in flink-dist, which generates a new file with the same
      name into the binary distribution (I consider this to be bad).
      3. Loading resources means loading, parsing and a lot of error
      handling. Lots of things "can be null" or be a default value. So the
      values are unreliable and lots of code needs to handle this. In fact,
      when running from IntelliJ the properties file is generated poorly most
      of the time; it only works correctly during a normal Maven build.
   2. *Generate a Java source file and then simply compile this and make it
   part of the project.*
   Essentially the same model as you would have when using Apache Avro,
   Protobuf, ANTLR 4 and JavaCC (several of those are used in Flink!).
   A rough version of this (to be improved):
   https://github.com/apache/flink/commit/d215e4df60dc9d647dcee1aa9a2114cbf49d0566
   This method has several effects:
      1. The developer MUST run 'mvn generate-sources' before running the
      project from within the IDE. There are no fallback values for when the
      'actual' values are missing.
      2. The code/test will not run until this step is done.
      3. Because the file is generated by a plugin it is always correct. As
      a consequence all variables are always available and the downstream users
      no longer have to handle the "can be null" or "default value" situations.
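
To make the difference concrete, here is a minimal sketch of how the two
directions look to the calling code. It is not the code from either commit:
the class names, the resource name, the property key and the "2.12" value
are all made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

class BuildInfoSketch {

    // Direction 1: load a generated properties file from the classpath.
    // RESOURCE and the "scala.version" key are hypothetical names.
    static final class PropertiesBuildInfo {
        static final String RESOURCE = "hypothetical-build-info.properties";
        static final String UNKNOWN = "<unknown>";

        static String scalaVersion(ClassLoader loader) {
            Properties props = new Properties();
            // getResourceAsStream returns null when the file is missing;
            // try-with-resources tolerates a null resource.
            try (InputStream in = loader.getResourceAsStream(RESOURCE)) {
                if (in != null) {
                    props.load(in);
                }
            } catch (IOException e) {
                // Unreadable file: fall through to the default below.
            }
            // Callers can never be sure whether this is a real value.
            return props.getProperty("scala.version", UNKNOWN);
        }
    }

    // Direction 2: roughly what a source file written by a build plugin
    // could look like. The value is a placeholder filled in by the build.
    static final class GeneratedBuildInfo {
        static final String SCALA_VERSION = "2.12";

        private GeneratedBuildInfo() {}
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        System.out.println("properties: " + PropertiesBuildInfo.scalaVersion(loader));
        System.out.println("generated:  " + GeneratedBuildInfo.SCALA_VERSION);
    }
}
```

With the first variant every caller has to decide what to do with the
fallback value. With the second the constant is simply there, but only
after the generation step has run.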

So is generating code, similar to what I created, a desired change?
My opinion is that it is the better solution: the data available is more
reliable and as a consequence the rest of the code is simpler.
It would probably mean that during development of Flink you should be aware
of this and do an 'extra step' to get it running.

What do you guys think?

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
