Lately I have been studying the Flink source code to understand its internals. One thing that really surprised me was how much of the code throughout Flink is very similar to Spark's.
Open source projects learn from each other and apply similar ideas, but that is not what I mean. I am talking about literal copying of code. Many files look as if they were created by copy-pasting code directly from Spark and then renaming the variables so the files would not look identical. The more I study, the more "copy-pasted" code I find throughout Flink, from actors to machine learning to the analyzer to code generation. A few files have attribution, but most of them do not. I thought Flink was more advanced. Why is this happening?