New issue 263: Proto parser for Java

Request for enhancement to the Java library to include a Java port of the proto parser (.proto file <=> DescriptorProtos)

Discussion moved from issue 247...

I've been thinking a lot about run-time use of .proto files in Java (I use this technique), and I've gone through two approaches now. Both involve invoking protoc from Java and then using the output. The first was to perform actual code generation, compile, and load. The second (and better, I think) was to use the FileDescriptorSet output and load that into FileDescriptor(s).

The primary limitation of both approaches was the need to
1) invoke protoc through the "shell" (not very platform independent)
2) manipulate files when everything could/should be done in memory
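
To make the two limitations concrete, here is a minimal sketch of the second approach, assuming protoc is installed and reachable at the given path. The class and method names (ProtocInvoker, buildCommand, run) are hypothetical and not part of protobuf; the flags --proto_path, --include_imports and --descriptor_set_out are standard protoc options. Note that it needs both an external process and a temporary file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the protobuf library.
final class ProtocInvoker {
    // Builds the argument list for a protoc run that writes a serialized
    // FileDescriptorSet instead of generated source code.
    // protoFiles are paths to .proto files located under protoDir.
    static List<String> buildCommand(String protocPath, Path protoDir,
                                     Path descriptorOut, List<String> protoFiles) {
        List<String> cmd = new ArrayList<>();
        cmd.add(protocPath);
        cmd.add("--proto_path=" + protoDir);
        cmd.add("--include_imports");            // also emit descriptors of imported files
        cmd.add("--descriptor_set_out=" + descriptorOut);
        cmd.addAll(protoFiles);
        return cmd;
    }

    // Runs protoc as an external process and returns the raw FileDescriptorSet bytes.
    static byte[] run(String protocPath, Path protoDir, List<String> protoFiles)
            throws IOException, InterruptedException {
        Path out = Files.createTempFile("descriptors", ".desc");
        try {
            Process p = new ProcessBuilder(buildCommand(protocPath, protoDir, out, protoFiles))
                    .inheritIO()                  // show protoc's error messages on our console
                    .start();
            int exit = p.waitFor();
            if (exit != 0) {
                throw new IOException("protoc exited with status " + exit);
            }
            return Files.readAllBytes(out);
        } finally {
            Files.deleteIfExists(out);            // the temp file is only a hand-off buffer
        }
    }
}
```

With protobuf-java on the classpath, the returned bytes could then be handed to DescriptorProtos.FileDescriptorSet.parseFrom(byte[]) and each contained FileDescriptorProto turned into a FileDescriptor via Descriptors.FileDescriptor.buildFrom.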

The more I think about it, the more I think the right approach is to treat this as an advanced use of protobuf and say that the most flexible solution for users who want to do this kind of work is a Java version of the current C++ library that translates between ".proto" files and "FileDescriptorProto" protobufs.

This is the one big piece missing from the run-time Java capability; everything except going .proto <=> FileDescriptorProto is in there already.

I'm interested in what Kenton or other active developers of the library think about this. How difficult would it be to port the C++ code to Java? How much of a maintenance problem would this create? I would be willing to assist with this effort if it is thought to be a useful contribution to the Java library. If this is believed to be a worthwhile pursuit, I think it would supersede the need for console output from protoc.


The C++ parser and descriptor validation code is a lot more complicated than you might imagine. The most difficult part is custom options: a custom option can be used in the same file where it is defined -- even before or *inside of* its own definition.

So, porting would be difficult, and likely buggy. Moreover, keeping the code in sync with the C++ parser would be annoying.

On the other hand, it would be very easy to write a comprehensive test -- just parse all the unit test .proto files using both the C++ and Java parsers and verify that identical DescriptorProtos come out.
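
As a rough sketch of that comparison test: the harness below runs two parsers over the same list of files and reports every file where the results differ. Everything here is hypothetical (ParserParityCheck, findMismatches, and the two parser functions are placeholders); in a real test, D would presumably be FileDescriptorProto, with the reference function wrapping protoc's output and the candidate wrapping the Java port.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical test harness for illustration; not part of the protobuf library.
final class ParserParityCheck {
    // Parses every file with both parsers and returns the names of the files
    // whose results are not equal. An empty list means the parsers agree.
    static <D> List<String> findMismatches(List<String> protoFiles,
                                           Function<String, D> referenceParser,
                                           Function<String, D> candidateParser) {
        List<String> mismatches = new ArrayList<>();
        for (String file : protoFiles) {
            D expected = referenceParser.apply(file);
            D actual = candidateParser.apply(file);
            if (!Objects.equals(expected, actual)) {
                mismatches.add(file);
            }
        }
        return mismatches;
    }
}
```

Comparing parsed messages with equals, rather than comparing serialized bytes, is a design assumption here: it avoids depending on a particular field ordering in the serialized form.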

After years of arguing against a Java port of the parser, I am starting to think that it may be time to give in... Even inside Google we have a hard time stopping Java users from writing their own .proto parsers (which are almost always broken).

