Ok, I see.

So, let's generate on the fly. No problem at all!

I will try to add the maven-proto-plugin and set up the generated code
as an independent package.
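
For reference, a minimal sketch of what that Maven wiring could look
like (the plugin coordinates, versions, and proto path below are
assumptions, not the final setup):

```xml
<!-- Hedged sketch: generating Java classes from the proto files at
     build time with the org.xolstice protobuf-maven-plugin. Versions
     and paths are illustrative. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.xolstice.maven.plugins</groupId>
      <artifactId>protobuf-maven-plugin</artifactId>
      <version>0.6.1</version>
      <configuration>
        <!-- ${os.detected.classifier} requires an OS-detection build
             extension (e.g. os-maven-plugin) registered in the pom. -->
        <protocArtifact>
          com.google.protobuf:protoc:3.25.1:exe:${os.detected.classifier}
        </protocArtifact>
        <!-- Assumed location of the .proto sources in the repo. -->
        <protoSourceRoot>${basedir}/../format</protoSourceRoot>
      </configuration>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With this, the generated classes live only under target/ and never need
to be committed.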

On Fri, 2024-06-14 at 07:51 +0000, Weibin Zeng wrote:
> Hi Sem, thanks for bringing up this topic.
> 
> For C++, I prefer to incorporate code generation into the build
> process. GraphScope [1] uses this strategy for its C++/Python code,
> and it works well.
> 
> For Java/Scala, how about we make the format an independent package
> that the other modules depend on?
> 
> [1]
> https://github.com/alibaba/GraphScope/blob/main/analytical_engine/CMakeLists.txt#L288
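> 
> A minimal sketch of that strategy (the target and file names are
> illustrative, not GraphScope's or GraphAr's actual build):
> 
> ```cmake
> # Generate C++ sources from the .proto files at build time.
> find_package(Protobuf REQUIRED)
> protobuf_generate_cpp(PROTO_SRCS PROTO_HDRS format.proto)
> add_library(gar_format ${PROTO_SRCS} ${PROTO_HDRS})
> # Generated headers land in the build tree, so export that path.
> target_include_directories(gar_format PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
> target_link_libraries(gar_format PUBLIC protobuf::libprotobuf)
> ```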
> 
> Best,
> Weibin Zeng
> 
> On 2024/06/13 10:32:41 Sem wrote:
> > Hello!
> > 
> > Because we are switching to proto3 as the language of the GAR
> > format definition, we need to decide whether or not we are going to
> > store the generated code in git.
> > 
> > Pros of storing generated code:
> > 1. Stability: even if protoc changes or a plugin is deprecated, we
> > still have generated, compilable code in the repo;
> > 2. Usability: anyone can go to git and see what the actual code
> > looks like; also, users and developers do not have to care about
> > protoc/buf and can just clone the repo, and that's it;
> > 3. CI simplicity: we do not need to incorporate protoc/buf into the
> > build process;
> > 
> > Cons of storing generated code:
> > 1. Huge git diffs: in my experience, changing a single line in a
> > proto file can lead to a diff of hundreds of lines in the generated
> > classes;
> > 2. Code generated by protoc is effectively unreadable and does not
> > help much in understanding what is going on;
> > 3. Risk of outdated classes: it is hard to verify that the
> > generated code is up to date.
> > 
> > 
> > Sources of possible inspiration:
> > 1.
> > https://github.com/apache/spark/blob/master/dev/connect-check-protos.py
> > : a utility in the Apache Spark project that checks whether the
> > generated code is up to date. We could implement the same for
> > Java/C++ too.
> > 2.
> > https://github.com/apache/spark/blob/master/dev/connect-gen-protos.sh
> > : a utility in the Apache Spark project that regenerates the proto
> > classes for PySpark and applies formatting to reduce the git diff.
> > We could implement the same for Java/C++ too.
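
A minimal sketch of such an up-to-date check, in the spirit of Spark's
script (the paths and the protoc invocation below are assumptions, not
GraphAr's actual layout):

```python
# Sketch of a generated-code freshness check, in the spirit of Spark's
# dev/connect-check-protos.py. The repo paths and protoc flags are
# assumptions, not GraphAr's actual layout.
import filecmp
import subprocess
import tempfile
from pathlib import Path


def dirs_match(committed: Path, regenerated: Path) -> bool:
    """Return True when both directory trees contain matching files."""
    cmp = filecmp.dircmp(str(committed), str(regenerated))
    if cmp.left_only or cmp.right_only or cmp.diff_files or cmp.funny_files:
        return False
    return all(
        dirs_match(committed / d, regenerated / d) for d in cmp.common_dirs
    )


def generated_code_is_fresh(proto_dir: str, committed_dir: str) -> bool:
    """Regenerate classes into a temp dir and diff against the repo copy."""
    with tempfile.TemporaryDirectory() as tmp:
        protos = sorted(str(p) for p in Path(proto_dir).glob("*.proto"))
        subprocess.run(
            ["protoc", f"-I{proto_dir}", f"--python_out={tmp}", *protos],
            check=True,
        )
        return dirs_match(Path(committed_dir), Path(tmp))


# In CI one would call, with hypothetical paths, e.g.:
#   sys.exit(0 if generated_code_is_fresh("format", "pygraphar/proto") else 1)
```

CI then fails whenever someone edits a .proto file without regenerating
and committing the classes.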
> > 
> > 
> > How it is done in Apache Spark itself:
> > 1. The proto files are incorporated into the Maven build via the
> > maven-proto-plugin, so the Java classes are not stored in the repo
> > and are generated during the build;
> > 2. The Python classes are stored in the repo and are
> > generated/updated on request. CI checks that they are in sync.
> > 
> > Another option:
> > I talked with some engineers, and as I understand it, the best
> > solution and an industry standard is to put all the protos in a
> > separate repository, generate the classes there, and publish them
> > as packages. Those packages can then be used as dependencies. The
> > problem is that this requires splitting our monorepo into parts:
> > harder to work with, harder to onboard people, harder to test, etc.
> > 
> > Best regards,
> > Sem
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@graphar.apache.org
> > For additional commands, e-mail: dev-h...@graphar.apache.org
> > 
> > 
> 
> 

