Ok, I see. So, let's generate on the fly. No problem at all!
I will try to add maven-proto-plugin and set up the generated code as an
independent package.

On Fri, 2024-06-14 at 07:51 +0000, Weibin Zeng wrote:
> Hi Sem, thanks for bringing up this topic.
>
> For Cpp, I prefer to incorporate code generation into the build
> process. GraphScope[1] uses this strategy for cpp/python, and it
> works well.
>
> For Java/Scala, how about we make format an independent package and
> have other modules depend on it?
>
> [1] https://github.com/alibaba/GraphScope/blob/main/analytical_engine/CMakeLists.txt#L288
>
> Best,
> Weibin Zeng
>
> On 2024/06/13 10:32:41 Sem wrote:
> > Hello!
> >
> > Because we are switching to proto3 as the language of the GAR format
> > definition, we need to decide whether we are going to store
> > generated code in git or not.
> >
> > Pros of storing generated code:
> > 1. Stability: even if protoc changes or a plugin is deprecated, we
> >    still have generated and compilable code in the repo;
> > 2. Usability: anyone can go to git and see what the actual code
> >    looks like; also, users and developers do not need to care about
> >    protoc/buf and can just clone the repo and that's it;
> > 3. CI simplicity: we do not need to incorporate protoc/buf into the
> >    build process.
> >
> > Cons of storing generated code:
> > 1. Huge git diffs: in my experience, changing a single line in a
> >    proto file may lead to a diff of hundreds of lines in the
> >    generated classes;
> > 2. Code generated by protoc is practically unreadable and does not
> >    help much in understanding what is going on;
> > 3. Risk of outdated classes: I cannot think of a way to check that
> >    the generated code is up to date.
> >
> > Sources of possible inspiration:
> > 1. https://github.com/apache/spark/blob/master/dev/connect-check-protos.py
> >    : a utility in the Apache Spark project that checks whether the
> >    generated code is up to date. We may try to implement the same
> >    for Java/Cpp too.
> > 2. https://github.com/apache/spark/blob/master/dev/connect-gen-protos.sh
> >    : a utility in the Apache Spark project that regenerates proto
> >    classes for PySpark and applies formatting to reduce the git
> >    diff. We may try to implement the same for Java/Cpp too.
> >
> > How it is done in Apache Spark itself:
> > 1. proto files are incorporated into the Maven build via
> >    maven-proto-plugin, so Java classes are not stored in the repo
> >    and are generated during the build;
> > 2. Python classes are stored in the repo and are generated/updated
> >    on request. CI runs a check that they are in sync.
> >
> > Other options:
> > I talked with some engineers and, as I understand it, the best
> > solution and an industry standard is to put all the protos in a
> > separate repository that generates the classes and publishes them
> > as packages. After that, these packages can be used as
> > dependencies. The problem is that this requires splitting our
> > monorepo into parts: harder to work with, harder to onboard people,
> > harder to test, etc.
> >
> > Best regards,
> > Sem
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@graphar.apache.org
> > For additional commands, e-mail: dev-h...@graphar.apache.org
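
For the concern raised in the thread about outdated generated classes, here is a minimal sketch of a sync check. It is in the spirit of Spark's connect-check-protos.py but not taken from it: the idea is to regenerate into a temporary directory and compare that tree with the committed one. The function names and the two-directory layout are hypothetical, and the protoc/buf invocation itself is deliberately left out.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def stale_files(committed: Path, regenerated: Path) -> list[str]:
    """Return relative paths that are missing, extra, or differ between the trees."""
    old = tree_digest(committed)
    new = tree_digest(regenerated)
    # A path is stale if it exists on only one side or the contents differ.
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

A CI job could fail whenever `stale_files(...)` returns a non-empty list. This only matters if generated code is kept in git; when classes are generated during the build, as proposed above, the check is unnecessary.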