[protobuf] Issue 183 in protobuf: --java_out=output_list_file= parameter fails when compiling multiple proto files

2010-04-28 Thread protobuf

Status: New
Owner: ken...@google.com
Labels: Type-Defect Priority-Medium

New issue 183 by t.broyer: --java_out=output_list_file= parameter fails  
when compiling multiple proto files

http://code.google.com/p/protobuf/issues/detail?id=183

What steps will reproduce the problem?
Run protoc with multiple input files to generate Java and passing the
output_list_file parameter. For instance, derived from the command line
found in java/README.txt:

  $ protoc --java_out=output_list_file=generated_files.txt:src/main/java \
  -I../src ../src/google/protobuf/descriptor.proto \
  ../src/google/protobuf/compiler/plugin.proto

What is the expected output? What do you see instead?
generated_files.txt should contain the list of all generated files (in this case
two lines: com/google/protobuf/DescriptorProtos.java and
google/protobuf/compiler/PluginProtos.java), but instead protoc fails with
the following message:

  generated_files.txt: Tried to write the same file twice.

What version of the product are you using? On what operating system?
2.3.0 on Windows XP (using the precompiled protoc.exe)

Please provide any additional information below.
This unfortunately is a flaw in the CodeGenerator architecture: a
CodeGenerator is called for each .proto file to generate and cannot
maintain state (in this case it would probably be the vector<string>
all_files) between calls as there are no start/end hooks (to know when to
flush the list to the file).

The workaround is to invoke protoc repeatedly with only 1 .proto file and a
different output_list_file each time.
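
A minimal sketch of that workaround, driving protoc from Java: one
invocation per .proto file, each with its own output_list_file, and the
per-file lists concatenated afterwards. The class and method names and the
generated_files_N.txt naming are made up for illustration, and the sketch
assumes protoc is on the PATH and writes the list file relative to the
output directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ProtocPerFile {
    // Builds the command line for a single .proto file, using the same
    // flags as the command in the report above.
    static List<String> buildCommand(String includeDir, String outDir,
                                     String listFile, String proto) {
        List<String> cmd = new ArrayList<>();
        cmd.add("protoc");
        cmd.add("--java_out=output_list_file=" + listFile + ":" + outDir);
        cmd.add("-I" + includeDir);
        cmd.add(proto);
        return cmd;
    }

    // Runs protoc once per input file and merges the per-file lists.
    static List<String> generate(String includeDir, String outDir, List<String> protos)
            throws IOException, InterruptedException {
        List<String> allGenerated = new ArrayList<>();
        for (int i = 0; i < protos.size(); i++) {
            // Hypothetical naming scheme: a distinct list file per invocation.
            String listFile = "generated_files_" + i + ".txt";
            Process p = new ProcessBuilder(
                    buildCommand(includeDir, outDir, listFile, protos.get(i)))
                    .inheritIO()
                    .start();
            if (p.waitFor() != 0) {
                throw new IOException("protoc failed for " + protos.get(i));
            }
            // Assumption: the list file is created inside the output directory.
            allGenerated.addAll(Files.readAllLines(Paths.get(outDir, listFile)));
        }
        return allGenerated;
    }
}
```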

--
You received this message because you are listed in the owner
or CC fields of this issue, or because you starred this issue.
You may adjust your issue notification preferences at:
http://code.google.com/hosting/settings

--
You received this message because you are subscribed to the Google Groups Protocol 
Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] Re: Issue 182 in protobuf: chromium protoc.exe linker errors when compiled, vs2005

2010-04-28 Thread protobuf


Comment #1 on issue 182 by ken...@google.com: chromium protoc.exe linker  
errors when compiled, vs2005

http://code.google.com/p/protobuf/issues/detail?id=182

These look like standard library symbols.  Are you possibly compiling
against a different C++ runtime library version than you are linking
against?




[protobuf] Re: Issue 183 in protobuf: --java_out=output_list_file= parameter fails when compiling multiple proto files

2010-04-28 Thread protobuf


Comment #1 on issue 183 by ken...@google.com: --java_out=output_list_file=  
parameter fails when compiling multiple proto files

http://code.google.com/p/protobuf/issues/detail?id=183

Yeah, this cannot be fixed without design changes.

Perhaps you could instead take advantage of .jar output mode?  If you
output to a .jar, then you can enumerate files in the .jar easily.
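
As a rough sketch of what enumerating the .jar could look like, using only
java.util.jar from the JDK. The class and method names are illustrative,
and filtering on the ".java" suffix is an assumption about what the caller
wants listed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class JarListing {
    // Returns the names of all .java entries in the given jar, in archive order.
    static List<String> listGeneratedSources(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // Skip directories and non-source entries such as the manifest.
                if (!entry.isDirectory() && entry.getName().endsWith(".java")) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```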




[protobuf] Re: Issue 183 in protobuf: --java_out=output_list_file= parameter fails when compiling multiple proto files

2010-04-28 Thread protobuf


Comment #2 on issue 183 by t.broyer: --java_out=output_list_file= parameter  
fails when compiling multiple proto files

http://code.google.com/p/protobuf/issues/detail?id=183

I actually do not have a need for it myself; I just stumbled upon it while
working on my Java-based plugin (where I want to output a file listing
other files generated by each .proto file processed; namely a GWT module
that inherits the GWT modules generated for each .proto file), and it
occurred to me that it couldn't have been done in C++ using the
CodeGenerator API (and PluginMain) only.




[protobuf] Re: Java UTF-8 encoding/decoding: possible performance improvements

2010-04-28 Thread Evan Jones

Evan Jones wrote:
Problem 2: Using the NIO encoders/decoders can be faster than
String.getBytes, but only if it is used >= 4 times. If used only
once, it is worse. The same is approximately true about decoding.
Lame results: http://evanjones.ca/software/java-string-encoding.html


I'm revisiting this old issue, thanks to being reminded about it by an  
earlier message. I've tested this with recent JVMs, and it still  
seems to hold true: using the NIO encoders and decoders can be faster  
than using String.getBytes(). My numbers show that encoding is  
approximately 40% faster, while decoding shows a smaller improvement.  
See the following for details on this microbenchmark:


http://evanjones.ca/software/java-string-encoding.html
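
For readers unfamiliar with the NIO API, here is a minimal sketch of the
general idea: keep one CharsetEncoder per thread and reuse it across calls
instead of calling String.getBytes() each time. It is not the benchmarked
code or the patch mentioned in this message; the class name is made up, and
allocating a fresh output buffer on every call is a simplification for
illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ReusableUtf8Encoder {
    // One encoder per thread: CharsetEncoder instances are not thread-safe.
    // REPLACE mirrors String.getBytes(), which substitutes bad input
    // instead of throwing.
    private static final ThreadLocal<CharsetEncoder> ENCODER =
            ThreadLocal.withInitial(() -> StandardCharsets.UTF_8.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE));

    static byte[] encode(String s) {
        CharsetEncoder encoder = ENCODER.get();
        encoder.reset(); // clear state left over from the previous call
        // Size the buffer for the worst case reported by the encoder.
        int capacity = (int) Math.ceil(s.length() * (double) encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);
        CoderResult result = encoder.encode(CharBuffer.wrap(s), out, true);
        if (result.isUnderflow()) {
            result = encoder.flush(out);
        }
        if (!result.isUnderflow()) {
            throw new IllegalStateException("unexpected coder result: " + result);
        }
        // Copy exactly the bytes that were written.
        byte[] bytes = new byte[out.position()];
        out.flip();
        out.get(bytes);
        return bytes;
    }
}
```

The result should match s.getBytes("UTF-8") byte for byte; the only
intended difference is that the encoder object is created once per thread
rather than looked up on every call.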


This surprises me, since it suggests that Sun/Oracle could replace  
their implementation of String.getBytes() with something similar to my  
code, and get a performance improvement. In fact, with privileged  
access to the internals of a String, they should be able to do even  
better.


I've integrated this change into protobuf and with the microbenchmark  
in the protobuf source tree, it shows a performance improvement of  
~20% (numbers below). I'll send a code review with the code shortly,  
once I've cleaned it up a bit, in case anyone wants to look at it.



Pros:
+ Faster protocol buffer encoding.

Cons:
- Far more code to handle encoding (like an extra hundred lines or so)  
which means there could be bugs.

- Extra memory (~2 kB per thread for encoding)


I'm unsure if the benefits outweigh the costs, particularly since the  
fact this appears faster seems to be a surprising result, and I  
wouldn't be shocked to find that future JVM / JDK releases could make  
this optimization useless. I'll leave that decision to the protocol  
buffer maintainers.


Evan



Results: All the serialize results are better. The deserialize
results are unchanged (as expected).



SpeedMessage1 (which is small):
Original: Serialize to byte string: 12360424 iterations in 29.83s; 90.097984MB/s
Optimized: Serialize to byte string: 15911623 iterations in 29.997s; 115.337776MB/s


SpeedMessage2 (larger):
Original: Serialize to byte string: 33482 iterations in 29.754s; 90.757484MB/s
Optimized: Serialize to byte string: 40381 iterations in 30.031s; 108.44853MB/s
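
The relative gain can be checked directly from the four throughput figures
above. A small sketch of the arithmetic (the class and method names are
illustrative only):

```java
class Speedup {
    // Percentage throughput gain of the optimized run over the original run.
    static double percentGain(double originalMBps, double optimizedMBps) {
        return (optimizedMBps / originalMBps - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // "Serialize to byte string" throughput, original vs. optimized.
        System.out.printf("SpeedMessage1: %.1f%%%n", percentGain(90.097984, 115.337776));
        System.out.printf("SpeedMessage2: %.1f%%%n", percentGain(90.757484, 108.44853));
    }
}
```

This works out to roughly 28% for SpeedMessage1 and 19.5% for SpeedMessage2.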


Raw results on my Macbook (Core2 Duo CPU):

ORIGINAL
Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat

Serialize to byte string: 12360424 iterations in 29.83s; 90.097984MB/s
Serialize to byte array: 12244951 iterations in 30.303s; 87.86307MB/s
Serialize to memory stream: 8699469 iterations in 23.732s; 79.70642MB/s
Serialize to /dev/null with FileOutputStream: 6075179 iterations in 27.764s; 47.578636MB/s
Serialize to /dev/null reusing FileOutputStream: 6975006 iterations in 29.99s; 50.571175MB/s
Serialize to /dev/null with FileChannel: 10375092 iterations in 29.864s; 75.54034MB/s
Serialize to /dev/null reusing FileChannel: 11166943 iterations in 30.699s; 79.09427MB/s
Deserialize from byte string: 14463117 iterations in 30.06s; 104.618355MB/s
Deserialize from byte array: 14436567 iterations in 30.007s; 104.61074MB/s
Deserialize from memory stream: 6221772 iterations in 28.024s; 48.274624MB/s


Benchmarking benchmarks.GoogleSpeed$SpeedMessage2 with file google_message2.dat

Serialize to byte string: 33482 iterations in 29.754s; 90.757484MB/s
Serialize to byte array: 33103 iterations in 29.517s; 90.45062MB/s
Serialize to memory stream: 28872 iterations in 29.939s; 77.77786MB/s
Serialize to /dev/null with FileOutputStream: 32934 iterations in 29.927s; 88.756MB/s
Serialize to /dev/null reusing FileOutputStream: 32979 iterations in 29.887s; 88.99622MB/s
Serialize to /dev/null with FileChannel: 32447 iterations in 29.921s; 87.46108MB/s
Serialize to /dev/null reusing FileChannel: 32585 iterations in 29.903s; 87.88594MB/s
Deserialize from byte string: 38388 iterations in 29.879s; 103.620544MB/s
Deserialize from byte array: 38677 iterations in 29.866s; 104.446075MB/s
Deserialize from memory stream: 37879 iterations in 29.954s; 101.990585MB/s


OPTIMIZED
Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat

Serialize to byte string: 15911623 iterations in 29.997s; 115.337776MB/s
Serialize to byte array: 16152646 iterations in 30.008s; 117.041954MB/s
Serialize to memory stream: 14859367 iterations in 29.551s; 109.33597MB/s
Serialize to /dev/null with FileOutputStream: 7224915 iterations in 29.954s; 52.446056MB/s
Serialize to /dev/null reusing FileOutputStream: 7479144 iterations in 30.081s; 54.062305MB/s
Serialize to /dev/null with FileChannel: 12730586 iterations in 30.025s; 92.193504MB/s
Serialize to /dev/null reusing FileChannel: 14024645 iterations in 30.399s; 100.31538MB/s
Deserialize from byte string: 14390338 iterations in 29.958s; 104.44631MB/s
Deserialize from byte array: 14496442 iterations in 30.142s; 104.57414MB/s
Deserialize from memory 

[protobuf] Re: Issue 183 in protobuf: --java_out=output_list_file= parameter fails when compiling multiple proto files

2010-04-28 Thread protobuf


Comment #3 on issue 183 by ken...@google.com: --java_out=output_list_file=  
parameter fails when compiling multiple proto files

http://code.google.com/p/protobuf/issues/detail?id=183

Well, you're the first person ever to report this.  I suspect no one
actually uses this feature...
